PREFACE


During the second World War, scientific methods of personnel 
selection were applied in the Royal Navy, the Army and A.T.S., 
and the Royal Air Force on a scale quite unprecedented in 
Britain. This development has naturally aroused widespread interest,
and a desire for further information beyond that provided by
articles in the popular Press or in technical psychological journals.
The present book attempts, therefore, to give an account of the procedures
used and the results obtained. It is addressed both to the
industrialist or educationist who hopes that the methods may be of
value in the peace-time selection of employees, students or pupils,
and to the student of psychology or to the personnel officer who 
wishes to know what psychological discoveries were made and 
what techniques were found most valuable. If for this reason it 
falls between two stools, we must apologise; it is not easy to write 
a simple, readable and interesting description for the former and 
at the same time to give the latter all the technical details that he 
needs. Statistical terms, formulae and calculations have been kept 
to the barest minimum and will, it is hoped, be published elsewhere. 

Our primary aim has been to describe such novel contributions 
as were made in the Fighting Services. But although space has not 
allowed a comprehensive survey of peace-time selection and guidance,
we have tried to view our methods and results in this broader
setting, and to integrate them with recent developments in civilian
vocational and educational psychology. 

After an historical introduction, Part I is concerned with the
organisation of selection, the general procedures employed, and the
work of psychologists in the Royal Navy, Army and A.T.S., and
the R.A.F. Part II takes up the principles of selection and guidance
which have evolved both from pre-war investigations and from
war-time experience. It reviews the techniques of interviewing and
testing, together with evidence of their merits and defects, in some 
detail. Each chapter is preceded by a general summary or abstract, 
and the main conclusions relevant to peace-time work are brought 
together in the final chapter. 

It should be made clear that two important topics are almost
wholly neglected. First, we have not been able to cover the work
on selection problems by psychologists outside the British Fighting
Services during 1939-47. Much was done, for example, by members
of the Cambridge University Psychology Department, University
College (London), and by other teams and individuals.
Again, to attempt to describe the even more extensive developments
of personnel selection in the American Forces would have
been quite impossible at the time of writing. Secondly, we have not
tried to deal with military applications of psychology apart from
selection—for example, studies of training methods, of the design
and layout of equipment, of the effects of conditions of warfare on
morale, and so forth—whether conducted by psychologists inside
or outside the Services. It may safely be stated that at least as
valuable contributions to the efficiency of the Forces can be, and
have been, made by psychologists in such fields as by those concerned
with selection and guidance. The same, of course, is true in
industry and education. 

The reader’s attention is drawn to two appendices which list, 
first the naval, military and air force abbreviations employed, and 
secondly, the numbers or abbreviated titles of the main psychological
tests referred to in the text. Dates are inserted in the text
after the authors’ names to serve as references to their books or
articles, which are listed in the bibliography at the end of the book.

We wish to express our thanks to Mr. Alec Rodger, Col. B.
Ungerson, Dr. H. J. Eysenck and Mr. B. S. Morris for reading
parts, or the whole, of the manuscript and for their suggestions and
criticisms; and to Mrs. Dorothy Vernon for help in preparing
the index. Acknowledgments are also due to the American Psychological
Association, Inc., and the National Institute of Industrial
Psychology for permission to reprint portions of articles (Vernon,
1947a, 1947b; Parry, 1947) which have appeared in The American
Psychologist and Occupational Psychology.

Finally, it should be stated that this book is published by
permission of the Lords Commissioners of the Admiralty, the
Army Council and the Air Council, but the responsibility for
any statements of fact or opinions rests solely with the authors.
PART I


The organisation of selection, the
general procedures employed and
the work of psychologists in the 
Royal Navy, Army and A.T.S., 
and the Royal Air Force. 




CHAPTER I 


THE RISE OF VOCATIONAL PSYCHOLOGY 

Psychology and the other social sciences can help in the
solution of many problems of the modern world, both in war and
peace. The scientific approach is particularly valuable whenever
prediction of human behaviour enters, as in educational and vocational
selection. The chart on page 13 indicates the main origins
and the course of development of vocational psychology, which 
are described in this chapter. 


It is a truism that the solution of the problems of individuals and 
of societies in the modern world depends more on progress in the 
social sciences, including psychology, sociology and economics, 
than on further advances in the physical and biological sciences, 
engineering and medicine. The importance of psychology becomes 
particularly obvious during periods of emergency such as the two 
World Wars, since the need for making the most effective possible 
use of human as well as of mechanical resources is then realised. 
It is the object of this book to describe the developments that 
occurred in war-time in applying psychology to the classification of 
British manpower, and especially of mind or brain power.*
Scarcely less urgent, however, are the problems of adjusting the 
individual in peace-time to the educational and industrial systems 
of the society in which he lives, and of adjusting these systems to 
the legitimate demands of individuals. Hence we shall attempt to
show how far experience gained in the guidance and selection of 
recruits may be carried over into civilian practice. 

Now the social sciences are at the same time the most immature
and the most intractable of all the sciences precisely because they
are concerned with man. Individuals differ too widely in their
tastes, talents and temperaments to be readily amenable to scientific
analysis and control. As each one is usually convinced that he knows
what is best for himself and his neighbours, he resents being
studied, classified and regulated by psychologists and economists,
and fears that this involves forfeiting his free choice and being
directed in all his doings from birth to the grave by an army of
scientific experts—a prospect even more abhorrent than his present
subjection to teachers, employers and civil servants. This suspicion
is hardly warranted, for several reasons.

* This conception of “human engineering” is borrowed from Yoakum and
Yerkes’ book on mental tests in the American Army (1920), and Yerkes’
memorandum to the U.S. Secretary of War (1941).

First, the psychologist admits, more readily indeed than does the
opinionated layman, that he is not omniscient about human nature.
He prefers not to pronounce definite decisions except in those
instances where he has strong scientific proof of their validity. In
fact, his caution is often apt to seem rather irritating. Secondly,
the good psychologist recognises and allows as much scope as
possible for individuality. He never willingly dictates a course of
action, but attempts rather to clarify the situation confronting the
individual, and to draw attention to all the relevant factors, so that
the individual may himself arrive at a wise and satisfying solution.
(His method of tackling problems within a group or organisation,
such as an industrial firm, or the army, is similar. It certainly does
not approximate to psychological totalitarianism.)* Thirdly, it
must be admitted that man’s freedom is already greatly circumscribed.
Decisions are made over his head by his parents, teachers
and employers, usually on much more arbitrary and fallible
grounds than those which the psychologist, with his specialised
training and ingenious techniques, takes into account. Expressed
another way, the life-history of an individual is punctuated by
a series of predictions. When his mother smacks him, she predicts
that this will make him behave better in future; his success or
failure in the Special Place examination predicts his suitability or
unsuitability for Secondary education; similarly, his acceptance or
rejection for a job, or in marriage, are essentially prophecies.
Psychological investigations have proved that many of these predictions
are incorrect, and that they are potent sources of unhappiness,
delinquency and neurosis. Predictions by psychologists
themselves are certainly not perfect, but they are constantly being
checked and improved. In many fields, such as the upbringing of
children, in educational and vocational guidance and selection,
their superiority to the haphazard and unsystematic predictions
of everyday life has been demonstrated. In others, such as
international relations and economic policies, psychologists
are convinced that scientific investigations would help, but are
seldom so rash as to claim that their knowledge is sufficiently far
advanced to provide immediate remedies for the country’s or the
world’s ills.

* Cf. the articles on psychotherapy, industrial and educational investigations
by members of the Tavistock Clinic, Jaques, et al. (1947).


Historical Outline

Only a bird’s eye view will be attempted here of the development
of the branch of applied psychology with which we are concerned.
Viteles (1932) provides an excellent history of industrial psychology;
a Board of Education Report (1924) describes the rise of
mental testing, and Flugel (1933) outlines the wider background.
Until halfway through the nineteenth century, psychology was an
entirely theoretical subject—a branch of philosophy—and the
foundation of the first laboratory at Leipzig by Wundt in 1879
epitomised its conversion into an experimental science of human
thought and behaviour. Such work as that of Helmholtz on vision,
hearing and reaction time (quickness of response), and Galton’s
fertile explorations of individual differences in intellect and mental
imagery (Inquiries Into Human Faculty, 1883), contributed to this
new outlook. Among Wundt’s many pupils were the Americans,
J. McK. Cattell, who in the 1890’s investigated the wide range of
differences on simple tests of sensory and motor capacities, and
Stanley Hall, who fostered some of the first systematic studies of
childhood. Contemporaneously the nineteenth century witnessed
an increase in humanitarian attitudes towards the insane and
mentally defective, and in education and in industry, and these
stimulated the demand for better methods of diagnosing and
predicting human qualities and capacities.



Education

As universal education spread, the tremendous differences in
innate learning abilities came to be realised. With the desire to give
higher education to those who would profit most instead of merely
to a privileged social class, there arose the system of written
examinations. The Civil Service adopted examinations to eradicate
the evils of nomination in 1870, and the professions similarly
reformed their qualifying procedures at about this time.

Such examinations marked a great advance over oral tests by
inspectors in schools and “disputations” in the universities, but later
work has revealed their defectiveness as selection techniques
owing to unreliability and subjective standards of marking (cf.
Vernon, 1940). The first tests of intelligence, namely Ebbinghaus’s
Completion Test and Binet’s famous scale, were constructed at the
request of the education authorities of Breslau and Paris respectively
at the turn of the century. These tests and their numerous
progeny made more accurate the segregation of mentally defective
children for special education, also the discrimination between
backward children who are innately dull and those whose retardation
is due to physical, emotional or other causes. Tests of attainment
in spelling, reading and arithmetic, and scales for grading
hand-writing and composition, were constructed by Thorndike,
Courtis and others in America around 1910. These were standardised
so that a child’s performance could be compared with the
average for his age, and accurate surveys could be made of the
relative achievement of classes or schools. Thorndike also established
the pattern for scientific studies of teaching methods. The
superiority or inferiority of a new device over stock methods could 
be demonstrated by the progress made among comparable groups 
of children, instead of by personal impression and arguments based 
on tradition. In 1913 Burt was appointed by the London County 
Council as first psychologist to an education authority and initiated 
his series of researches into intelligence and attainment tests, into 
delinquency and backwardness, and into statistical methods of 
analysing psychological data, which formed the basis of so much 
subsequent British work in applied psychology. 

We must here retrace our steps and mention the important
contributions to the conception of mental measurement of mathematicians
such as Quetelet, who demonstrated the applicability
of the normal or Gaussian frequency curve to human characteristics,
and Galton and Pearson whose technique of correlation
made it possible to measure the extent of agreement between scores
on two or more tests or examinations (cf. p. 104). From this
evolved factor analysis, which is used for classifying the main
types of human abilities. In early writings on psychology and
education frequent reference is made to faculties or mental powers
such as attention, memory, observation, etc. Spearman’s researches
from 1904 onwards largely discredited these and demonstrated the
importance of “g”, the general intellectual factor, underlying all
our abilities. Burt and Thomson in England and Thurstone and
others in America greatly extended factorial techniques and
demonstrated the existence of additional factors or types of ability
—verbal, numerical, mechanical, spatial, etc.
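
The correlation technique mentioned above may be illustrated by a minimal
modern sketch. It is written in Python, and the function name pearson_r and
the two sets of scores are invented for illustration only; they are
assumptions and are not taken from the text or from any Service data. The
sketch computes the product-moment coefficient between the scores of the
same six candidates on two tests.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores of six candidates on two tests (invented data).
test_a = [12, 15, 9, 20, 17, 11]
test_b = [30, 34, 25, 41, 39, 27]

r = pearson_r(test_a, test_b)
print(round(r, 3))
```

A coefficient near +1 indicates that the two tests rank the candidates in
close agreement, a value near 0 that they show little agreement, and a
negative value that high scores on one test go with low scores on the other.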

Educational psychology was profoundly influenced again by 
nineteenth-century developments in evolutionary biology and in 
psychopathology. McDougall’s writings on the instincts and 
sentiments and Freud’s studies of the unconscious motives underlying
human nature, greatly increased our understanding of the
difficult or nervous child, the delinquent and the neurotic. The 
first clinic for children was established by Witmer at the University 
of Pennsylvania in 1896, but this was mainly concerned with 
treating scholastic difficulties. In 1909 Healy commenced his 
studies of behaviour problem cases and delinquents in Chicago. 
Burt was the chief originator of similar work in England, followed 
by Boyd and Drever in Scotland.

Industry 

While many early psychological studies of fatigue and of conditions
affecting learning were relevant to industrial work, the
first important experiments in industry itself were those of Taylor
and Gilbreth, in America, around 1910. The stress laid by these
experts in “scientific management” and their followers on the
increased output resulting from more efficient working methods
tended to antagonise workers and the Trade Unions. Although
industrial psychologists have been at least as interested in the
satisfaction of the worker as in his output, this suspicion has 
unfortunately persisted and was responsible for some lack of 
co-operation in the Second World War between the Ministry of 
Labour and psychologists in the Forces. Nevertheless an important 
series of researches on conditions in British factories was begun 
by H. M. Vernon, under the Health of Munition Workers Committee,
in the first war, and continued by the Industrial Fatigue
Research Board (later the Industrial Health Research Board)—a
branch of the Medical Research Council. These were concerned
with working hours, rest pauses, heating, lighting, accidents,
fatigue and boredom, and methods of selection and guidance. 

As early as 1906 a French psychologist, Lahy, was studying the 
abilities required for typewriting and tramway driving, but the 
realisation of the general principles of vocational psychology is
usually attributed to Muensterberg, who worked in Germany till
1911 and then in America. He outlined the functions of psychology 
in contributing to industrial efficiency and in serving the interests 
both of employees and employers, and himself developed tests 
for selecting tramcar drivers, telephone operators and others. 
Gradually the view made headway that the choice of the right man 
is no less important than the choice of the right machine. It was 
estimated that, at a typical job, the most efficient workers are fully 
three times as competent as the least efficient. 

1914-1918 

This brings us to the first World War, when psychologists in 
Germany, Britain, France, Italy and America were engaged on the 
development of tests for night vision, for the selection of pilots and 
observers, drivers and telegraphists, for the anti-submarine service, 
etc., as well as on other contributions to military psychology which 
Burt (1942) has described. Few of these tests have survived, and 
in most countries psychologists were too few in number to undertake
large-scale schemes or to have much influence on the service
authorities. But an outstanding landmark in the history of vocational
psychology was the application, during 1917-18, of group
intelligence tests (devised by Otis and other American psychologists)
to nearly two million American army recruits (cf. Yoakum
and Yerkes, 1920; Memoirs of the National Academy of
Sciences, 1921). Up to 2,000 a day were tested in any one centre,
in batches of one to five hundred at a time. A battery of verbal tests
—Army Alpha—was used with the majority, but a non-verbal
series—Army Beta—was given to the 30 per cent. who could
barely read English, and adaptations of the Binet scale, or performance
tests, to borderline defectives. The value of the test
scores was clearly demonstrated, and they could be used for:

(1) Selecting men of superior ability for officer or N.C.O. 
training, or other special duties. 

(2) Selecting duller men for labour or special training units,
and for eliminating the lowest grades. 

(3) Balancing the distribution of ability between different units. 

(4) Allocating men into groups which would be homogeneous in 
ability and thus helping instruction. 

Though these applications were permissive, not obligatory, 
psychologists had less trouble in persuading the Army to adopt
them than in dissuading it from attaching over-much importance
to test scores and neglecting physical, personality, and other
relevant factors. Owing to lack of time relatively little attempt was
made to test aptitudes for specialised training, but extensive work
was carried out on trade tests for assessing the proficiency of 
recruits from skilled and semi-skilled civilian trades. 

1919-1939 

This demonstration of the possibilities of large-scale selection
procedure provided an immense stimulus to group-testing in 
America. Intelligence tests soon came into general use in schools 
and universities, while achievement and new-type scholastic tests, 
with similar make-up and items to those of intelligence tests, 
largely supplanted the unreliable essay-type examination. Organisations
such as the American Council on Education through its
Co-operative Test Service provided each year new forms of
intelligence and achievement tests standardised to yield comparable
scores at the college level. In the 1930’s various methods of
automatic scoring were devised. With the most widely used
system (adopted for pencil-and-paper tests by the U.S. Forces in the
second World War), testees indicate their responses to each item by
drawing a line with a carbon pencil on a special answer sheet. When
the sheet is placed in an electrical machine, current passes through
correctly drawn lines and yields the total score instantaneously.

In this country intelligence tests have spread widely in primary 
schools. For example Moray House constructs new tests yearly for 
education authorities to apply in selection for secondary education 
at 11+. In 1932 and again in 1947 the complete 11-year-old
population of Scotland was tested by the Scottish Council for 
Research in Education. Individual tests such as revisions of the 
Binet scale, performance tests and educational tests (e.g. Burt’s, 
1921) are generally employed in Child Guidance Clinics. But there 
was relatively little testing in secondary schools or universities, or 
at the adult level until 1941, and new-type examinations have 
gained little ground in spite of Ballard’s able advocacy (1923), and 
the exposure by Hartog and Rhodes (1936) and others of the 
inadequacies of conventional examinations. Perhaps the most
notable outcomes of psychological work in 1914-18 were the 
setting up of the Industrial Fatigue Research Board, and the 
foundation by Myers of the National Institute of Industrial 
Psychology in 1921. The N.I.I.P. is an independent organisation
whose chief activities include investigations for particular firms
into their problems of selection, training and working conditions,
and the giving of vocational guidance to individual applicants—
mostly boys and girls leaving school—who desire advice on suitable
careers. So far as its limited funds would permit, it too has
conducted researches, for example into tests of mechanical aptitude,
of motor-driving and other abilities, and into practicable
techniques of guidance for school-leavers in large areas. While 
certain firms such as Rowntrees (cf. Northcott, 1946) have 
employed psychologists in their personnel departments, the 
majority of industrial and vocational psychologists in this country 
have been trained by the I.H.R.B. and N.I.I.P., and a great deal of 
the work of these bodies between the wars was directly relevant to 
the problems with which British psychologists were called upon 
to cope in 1941. Selection tests were applied to apprentices in 
the Army and the Royal Air Force, for example by Cox (1928), 
Stanbridge (1930), and Farmer and Chambers (1936). The writings
of Burt et al. (1926) and Macrae (1932) have been particularly
influential in the field of guidance. 

The first attempts to give vocational guidance to children are said 
to have been those of Parsons in America in about 1900, but many 
school teachers have long accepted the responsibility of advising 
their leaving pupils unofficially regarding suitable employment in 
the light of their knowledge of the pupils’ abilities and characters. 
Juvenile Employment Committees, working either under the 
Ministry of Labour or under Local Education Authorities, are 
nowadays concerned with vocational placement, and do their best 
to integrate the views of the school, the child’s and parents’ 
interests, and local labour requirements. But much evidence has 
accumulated as to the untrustworthiness of these haphazard procedures
and the superiority of the more scientific guidance methods
worked out by the N.I.I.P. (cf. Chapter VII). Mention should be
made also of the Appointments Boards or Committees for placing
university students in jobs, the first of which was established at
Oxford University in 1899. A growing number of secondary
schools have part-time careers masters or mistresses, given introductory
training at the N.I.I.P., who, in the quite inadequate time
usually allotted, try to survey the pupils systematically and to give
them guidance on leaving. Such activities are much better developed
in American high schools and colleges, where they are known as
“Student Counselling”. They include advice to students on their
work and educational careers, financial, health and family difficulties,
and personality maladjustments as well as vocational plans
(cf. Williamson, Darley, et al., 1937, 1941). The vast “Advisement
and Guidance” programme of the United States Veterans’ Administration
is on similar lines (cf. Scott and Lindley, 1946). Two to
three thousand psychologists are engaged at some 400 centres in
studying the potentialities and needs of ex-Service men and helping
them to readjust to civilian life and to reach sound decisions
regarding jobs or further training, etc. It is desirable to point out
the dangers of applying psychology on so large a scale, in view of
the difficulties of finding really suitable personnel and of training
them adequately. 

Educational and vocational guidance have often been given, in
this country, by Child Guidance Clinics. During the 1920’s 
clinics spread rapidly, both in America and Britain, with the 
support of the Commonwealth Fund. Their staff usually comprised 
a psychiatrist in charge of diagnosis and treatment, a social worker 
to explore the home background, and a psychologist to survey the 
children’s abilities and to carry out remedial educational work. 
Play and speech therapists were also often attached. This system 
has worked, and still does work, successfully. But just as vocational guidance
has generally been conducted by psychologists and teachers,
so child guidance is being taken over more and more by psychologists
with teaching qualifications—a psychiatrist usually being
available for consultation on cases showing serious neurotic tendencies.
The establishment of minimum qualifications for Associateship
and Fellowship of the British Psychological Society in
1941, and of its Committee of Professional Psychologists (Mental
Health) in 1944-5, has helped to ensure that guidance is being
given by properly trained persons. Another field in which the
relationship and functions of psychologist and medical officer are
gradually becoming clarified is that of psychodiagnosis among 
mental hospital patients or in adult clinics. When psychological 
departments were first introduced in mental hospitals, it was all 
too common for patients to be sent by the psychiatrist with a 
demand for “an I.Q.”. With the development of projection and 
other diagnostic tests (cf. Rapaport, 1944), the psychologist is nowadays
better qualified than his medical colleague to explore the
particular mental functions in which deterioration may have
occurred, and to analyse the general personality structure—
provided he has received thorough theoretical, practical, and interne
training. Such training is difficult to come by in this country as
yet, but in America several hundred clinical psychologists were
employed in the army during the war, and at least as many are now
working for the Veterans’ Administration on diagnosis, psychotherapy
and research with neuropsychiatric cases, in addition to
those in civilian posts (cf. Miller, 1946; Hutt and Milton, 
1947). 

A vast amount of research, which we cannot attempt to summarise
here, was published between the wars on topics of the
utmost concern to vocational psychologists, e.g. the relative
influence of heredity and environmental conditions on intelligence
and other abilities; differences between the sexes, national and 
occupational groups, and age changes; new tests and diagnostic 
techniques, particularly in the field of personality; statistical 
methods such as applications of analysis of variance to psychological
data; studies of the structure of, and relations between,
abilities and traits; analyses of jobs and of working methods. One 
matter, however, which we must not neglect is the development of 
vocational psychology in Germany. 

German Military Psychology 

The lead given by Muensterberg was taken up, as Viteles (1926-30)
has shown, more enthusiastically by German than by British
and American industry. Numerous organisations, including
Krupps, Zeiss and the Berlin Tramways had psychological departments
largely engaged in devising and applying selection tests to
applicants for employment. Moreover, while military psychology
lapsed completely in English-speaking countries during the 1920’s
and 30’s, Germany built on what had been achieved and actively
developed it in the reconstitution of her Forces from 1927 onwards.
It has been claimed, probably with justification, that the speed with
which the Luftwaffe was re-created was largely due to the excellent
quality of the human material selected for it. As Farago (1941),
Ansbacher (1941) and—more briefly—Burt (1942) describe, each
Army Corps had its psychological section, staffed by carefully
chosen personnel who had undergone military as well as psychological
training. Military psychology claimed to embrace personnel
selection, problems of training and indoctrination, discipline and
morale among recruits, design of machinery and equipment,
analysis of army employments and of leadership, study of the
psychological characteristics of foreign countries and propaganda
for home and abroad. But post-war enquiries indicate that very
little was actually done in fields other than selection. And even
here, although tests were devised for pilots, tank drivers, radio
operators, sound detectors and other specialists, the emphasis was
less on technical ability than on soldierly qualities—keenness,
loyalty, perseverance and adaptability. This tendency reached its
acme in the choice of officers.

Four or five candidates at a time underwent a three-day procedure
before a board consisting of a colonel (who made the final
decision), a medical officer and three psychologists. They were
tested on the first and third days, and during the intervening day
their behaviour was closely observed. While many of the tests
resemble those used in Britain and America, little use was made of
any scores; instead, qualitative features such as the manner of performing
the tasks were stressed, the whole aim being to build up
a realistic picture of the personality as a whole, not a compendium 
of separate traits and abilities. Nor was any one personality type 
desired. The examination of the candidates, it was stated, should 
keep as closely as possible to the concrete requirements of the 
officer’s life, and the psychologists were required to use their 
practical understanding of human nature as much as their technical 
skill. This examination fell into four main sections:

(1) Intellectual and other abilities. Ordinary verbal and per¬ 
formance tests were given, usually individually, tests of 
mechanical comprehension, and a sorting test to reveal the 
quality of intellectual generalisation or abstraction. A cinema 
film was shown for candidates to describe, their answers 
showing their powers of observation and imagination. 
Practical problems were posed in the form of essays to bring 
out their planning capacities. 

(2) “Action analysis,” i.e. character and temperament. A complex
choice reaction time test was claimed to show power of
sustained attention or distractibility, speed of learning,
fatiguability, and emotional control. In the study of personality,
a test involving interpretation of pictures (Thematic
Apperception) was used, together with a projection test
based on the completion of simple drawings. Other, military,
tests involved the drilling of recruits and instructing them in
some task or lecturing to them, also carrying out complex 
orders requiring quickness of uptake, agility and endurance, 
and improvisation in emergencies. Candidates were encour¬ 
aged or severely criticised, in order that the effects of these 
social stimuli might be observed. 

(3) Expression analysis. Facial expression was observed, and 
often photographed by a concealed cine camera, during con¬ 
versation or tests involving distractions or painful stimuli. 
Speech, handwriting and literary style were also analysed in 
detail, and the significance of each mode of expression for 
the total personality considered. 

(4) Life-history. The candidate’s party record was studied, and 
data were collected by questionnaire and interview on his
past development, his aims and ambitions, his self-evaluation 
and social attitudes. At the close there was a discussion of 
some topic by all the candidates, designed to show up their 
competitiveness and other social reactions. Detailed reports 
were drawn up for the consideration of the superior officer. 

Although the selectees, both officers and specialists, were 
followed up at intervals and their progress compared with the 
psychologists’ judgments, this comparison was of a subjective kind.
Excellent or good agreement was alleged in 80-90 per cent. of
cases, but there was no scientific validation either of the selection 
procedure as a whole, or of the deductions made from the separate 
tests. In other words the claim that such and such a test revealed 
will-power, imagination, or other vague faculties was entirely a 
priori, and in the absence of empirical substantiation, German 
psychologists had little defence when they fell into disfavour. 
Actually their prestige was already declining in 1930 and their work 
was entirely stopped in 1941. It is generally believed that this was 
due to political prejudice, but Davis (1947) attributes it rather to 
the inherent weaknesses of their approach. 

Conclusion 

With the opening of the second World War, British psychologists
were eager to put their training and experience to useful
service. But the prestige of vocational psychology among naval and
military authorities and in Government circles was much lower
in this country than in America, and there was no strong central
body for co-ordinating their efforts and drawing attention to the 
contributions they could make. Much valuable work was done 
during 1939-41 by individuals and by university teams, particu¬ 
larly for the Royal Air Force, also on such civilian psychological 
problems as evacuation of children, effects of air-raids, and public 
opinion and morale surveys. The next four chapters describe how
the Services came to realise their need for "human engineering" 
and how this was met. 



CHAPTER II 


PERSONNEL SELECTION IN THE ROYAL NAVY 

Abstract.—Selection in the Royal Navy was somewhat unscientific
and unsystematic up till 1941, when the Senior Psychologist’s 
Department of the Admiralty was established. The primary object 
of the department was to assist in selection of recruits at combined 
recruiting centres, and in their allocation to suitable branches at
entry establishments. The staff of the department, their training, 
and the tests and other techniques employed, are described. Most 
of the qualified personnel were civilians, but interviewing, alloca¬ 
tion, etc., were largely carried out by Wren petty officers or 
personnel selection officers. Important features were the provision 
of information on jobs to recruits, the matching of supply with the 
demands of all the branches, and re-allocation of failures. Personnel 
work was later developed in depots, barracks and specialist schools, 
among Fleet Air Arm mechanics, Marines and artificers; and a
scheme of selection was integrated with the training of officer 
candidates. Follow-up or validation investigations, other research 
activities, studies of training, documentation, “work-simplifica¬ 
tion” and opinion surveys, are outlined. 


Up till 1941 such selection as existed in the Royal Navy was 
based mainly on educational examinations and on interview. 
Scholastic performance played a large part in the admission of 
cadets and of artificer apprentices, that is of future officers and 
skilled tradesmen, also a smaller part in the acceptance of con¬ 
tinuous service and special service ratings. War-time volunteers 
and conscripts (“Hostilities Only” ratings) were medically
examined and interviewed at Combined Recruiting Centres
(C.R.C.s). Both their acceptance or rejection and the category or
branch to which they were allocated were decided by the naval
recruiter, who was usually a pensioner chief petty officer or petty
officer. For certain classes of tradesmen such as electrical mechanics
trade tests were given. Such specialists as gunnery, torpedo, Radar
and Asdic (anti-submarine) ratings were normally chosen after
a period at sea on the basis of a recommendation from their
commanding officer, but, owing to shortages of experienced men
around 1941, the major proportion were merely put on training
courses by the drafting officers at the naval depots or barracks
without any enquiry into their suitability. Neither the recruiters, 
nor the officers concerned with selection, had any training in interviewing,
and with the increasing complexity of modern equipment
and the multiplication of specialist categories, they could seldom 
be expected to know much about the jobs for which they were 
selecting. Many of them employed unstandardised spelling or 
mental arithmetic questions as tests of education and intelligence*.
Several training schools had home-made group intelligence tests;
fortunately, perhaps, little attention was paid to their results. Some
schools made approaches to trained psychologists for assistance.
Thus in December, 1940, one of the writers was asked to devise
selection tests for Asdic ratings, because the quality of men sent for
training was so poor. His battery (described on p. 2S1) was applied
for the rest of the war, and it certainly prevented a number of 
potential failures from cluttering up the courses. But its validity 
was not high enough for it to pick all the potential successes. In 
all, many hundreds of men made the lengthy journey to training
schools in Scotland, only to be rejected without even starting
training. Moreover, the procedure took no account of the equally
clamant needs of numerous other naval branches. These defects 
could only be overcome by a centrally organised scheme of 
selection. 

The system which worked adequately in the small peace-time 
Navy could barely cope with the vast war-time expansion. Although 
in 1941 the Navy was in the fortunate position of being able to 
reject four out of every five applicants, numerous complaints were 
made by entry establishments, depots, and specialist schools about 
the quality of trainees. For example among seamen torpedomen
the failure rate in one of the main schools rose from the pre-war
figure of under 10 per cent. to 31 per cent. in 1942, but was
brought down to 10 per cent. again by 1944. Not only were men
without the necessary mechanical or other aptitudes breaking the
hearts of the training staffs, but also many with excellent qualifications
were wasted on relatively unskilled work. The effects on
the morale of recruits in general were as serious as the shortages of
competent personnel.

* The classic example was the recruiter who was heard saying to a volunteer:
“What? Spell Egypt with a J? You’re illegible for the Navy.”

Partly owing to the urgency of these difficulties, partly through 
the initiative of the Consultant in Neuropsychiatry, a committee 
was appointed by the Admiralty in March, 1941, to study the 
selection and allocation of Hostilities Only recruits. A. Rodger
(1946) has described the visits paid by the committee to combined
recruiting centres, entry establishments, depots, training schools,
and ships of the Fleet; also how its recommendations were
accepted in the main, the Senior Psychologist’s (S.P.) Department
being established under the Second Sea Lord, and new schemes
instituted at recruiting centres and entry establishments before the 
end of the year. 

The Staff of the Senior Psychologist’s Department 

The backbone of the psychological staff consisted of eight 
civilian psychologists provided by the National Institute of Indus¬ 
trial Psychology. Five other men with psychological qualifications 
joined later, either as civilians or as R.N.V.R. officers, and two 
trained psychiatric social workers served throughout most of the 
war. But the bulk of the personnel had little or no psychological
training at the start. At the peak period they numbered about 300,
including 60 Wren first, second or third officers (personnel selection
officers), 120 Wren petty officers or C.P.O.’s (recruiting
assistants), and some 30 Wrens or Marines engaged on clerical
duties, 16 sick berth attendants (assistants to neuropsychiatrists), 
and about 20 civilian clerks and statistical assistants (cf. Expert 
Committee, 1947). 

The Wren recruiting assistants were mostly drawn direct from 
civil life, where they had been teachers, employment officers,
social workers and the like. Though only a small proportion were
university graduates, they were nearly all of university calibre and
were extremely carefully chosen for the work in view. They
received a fortnight’s intensive training in their future duties, in 
which the Director of Naval Recruiting and the Consultants in 
Neuropsychiatry and Ophthalmology co-operated. They had to 
acquire skill in interviewing, testing technique and record-keeping, 
and to obtain some acquaintance with the educational systems of 
Britain and with the nature of civilian occupations. Their work was 
regularly supervised by the psychiatric social workers. 
From their number the most suitable were later chosen for
promotion to personnel selection officer (P.S.O.). Though they had
by now gained a good deal of relevant experience, they required
much additional training in interviewing and in knowledge of
civilian and naval jobs. For it was they who had to decide whether
a man claiming to be a fitter, a lathe-setter, a radio mechanic, etc., 
was properly qualified; or whether a recruit would most suitably 
adapt to training as electrical mechanic, stoker, telegraphist, sea¬ 
man, writer, or cook and so on; and their employment recom¬ 
mendations were almost always accepted. Their training consisted 
mainly in an apprenticeship to an industrial psychologist who was 
already carrying out actual selection, but included visits to special¬ 
ist schools, conferences and refresher courses at the Admiralty, 
often a period in a Government training centre to learn about the
handling of bench and machine tools, and spells at the Admiralty 
to gain acquaintance with the administrative system and recording 
arrangements. 

Certain points deserve comment. First, the Senior Psychologist 
—the director of the organisation, whose status was similar to that
of the directors of several other Admiralty departments—was
himself a psychologist. Moreover, all the policy and administration
were decided and carried out by psychologists. Although psycho¬ 
logists as such did not at that time receive professional recognition 
in the Civil Service, as scientists or doctors did, yet there was a 
strong scientific atmosphere in the department and an insistence 
on professional standards, which undoubtedly helped to foster 
good work. And although psychologists and P.S.O.s had no 
executive authority in the Royal Navy, but acted throughout as
technical advisers, there was far less frustration than is common 
when psychologists work in Government departments or in the 
Services under non-professional direction (cf. Chase, 1947).

Secondly, most of the staff, including the fully-qualified psycho¬ 
logists, worked in the field—at entry establishments, specialist 
schools, depots, etc., and even the few trained personnel who were
attached to Admiralty headquarters made frequent excursions in 
order to keep in touch with the realities of selection. When any 
new job was undertaken, the guiding principle was for one of the 
professional psychologists to get it going on the spot and to seek 
proof of the effectiveness of his methods, then to make these 
methods as simple and foolproof as possible in order to hand them 
over to P.S.O.s or other less skilled personnel, so that he might be
free to turn his attention to yet another problem (cf. Straker, 1944).
The same principle is highly appropriate in peace-time industry
and education where, as shown below (p. 100), industrialists and
teachers will need to be trained to carry out most selection or other 
psychological work. 

The fact that nearly 90 per cent. of the personnel were women
was due primarily to the non-availability of men of similar intel¬ 
lectual calibre. But the success with which they acquired know¬ 
ledge of male occupations, and handled thousands of individual 
recruits or tested large groups, and became accepted as useful 
members of the staffs of almost all-male establishments, was very
striking. It is more because of the reduction in the size of the 
W.R.N.S. than because of any inability of women to carry out 
selection, that much of the post-war work has been taken over by
R.N. and R.N.V.R. personnel. This, too, carries a lesson for peace¬ 
time selection and guidance. 

The Recruiting Centre Scheme 

The function of Wren recruiting assistants was to supply the 
recruiters with factual information on the basis of which they could 
more effectively decide the suitability or otherwise of candidates 
for the Royal Navy. This they did by applying: 

(1) A biographical questionnaire, designed to bring out educa¬ 
tional and occupational history, leisure interests and experi¬ 
ence of leadership. 

(2) The Progressive Matrices test, S.P. Test 0.

(3) Selected plates from the American Optical Company’s 
version of the Ishihara and Stilling colour blindness tests. 

(4) A short interview, mainly for purposes of clarifying written
questionnaire responses, but usually including also a series 
of informal questions about nervousness, dieting and the like 
for detecting recruits who ought to be referred for psychiatric 
examination. 

Careful records were kept and weekly returns sent to head¬ 
quarters. The questionnaires and test results of accepted can¬ 
didates were forwarded to the recruits’ entry establishments. 
Although no fixed pass-mark was set on the intelligence test, few 
men from the bottom 10 per cent. of the population were accepted
unless, in the opinion of the recruiter, they possessed other 
exceptional qualifications for the Navy. By 1942-3 the Navy had to
take nearer 80 per cent. than 20 per cent. of candidates, but later it
became possible to raise the minimum intelligence standards for
the majority of recruits to approximately the median for the general
population. An analysis of returns and reasons for rejection over the
period December, 1941—May, 1942, showed that out of every 100
candidates:

74.4 were accepted
13.4 were rejected on medical grounds
 4.7 were rejected because their trade was not required, or
     because of insufficient trade experience
 2.6 were rejected on grounds of low Matrices scores
 1.1 were rejected on grounds of low educational level or
     illiteracy (judged from the questionnaire)
 3.8 were rejected on other grounds
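
As a quick arithmetic check on the breakdown above, the six categories can be totalled to confirm that they account for all 100 candidates. The following is a minimal sketch in Python; the dictionary keys are shorthand labels introduced here for illustration, not terms taken from the returns themselves.

```python
from decimal import Decimal

# Outcomes per 100 candidates, December 1941 to May 1942 (figures from the text).
# The key names are illustrative shorthand, not official categories.
OUTCOMES = {
    "accepted": Decimal("74.4"),
    "medical": Decimal("13.4"),
    "trade_not_required_or_inexperienced": Decimal("4.7"),
    "low_matrices_score": Decimal("2.6"),
    "low_education_or_illiteracy": Decimal("1.1"),
    "other": Decimal("3.8"),
}


def total(outcomes):
    """Sum every category; Decimal avoids binary floating-point drift."""
    return sum(outcomes.values(), Decimal("0"))


def rejected(outcomes):
    """Everything that is not an acceptance."""
    return total(outcomes) - outcomes["accepted"]


print(total(OUTCOMES))     # 100.0
print(rejected(OUTCOMES))  # 25.6
```

On these figures about a quarter of candidates were rejected, and medical grounds account for just over half of the rejections.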

In the five years following the institution of this scheme, approxi¬ 
mately one million candidates were put through it. But with the 
demobilisation of recruiting assistants at the end of the war, an 
interim scheme was introduced, which the recruiters themselves
were trained to operate. A battery of six self-administering 10-15
minute tests was provided, together with suggested standards for
the various categories to which recruits could be allocated (radio
mechanic, writer, seaman, stoker, etc.). Several parallel forms of
each of the following tests were constructed:

A. Mechanical information and comprehension.

B. Mathematics. C. Spelling. D. Abstraction.

E. Advanced arithmetic. F. English.

Only tests A and B are given to all conscripts and volunteers, 
C and D being added with recruits for the Regular Navy, and 
E and F serving as educational examinations for certain classes of
tradesmen. 
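
The rule for which of the six interim tests a recruit sits can be written out as a small lookup. This is only a sketch in Python: the function name and the two flags are hypothetical, the text does not say which classes of tradesmen took tests E and F, and it is assumed here that E and F were taken in addition to A and B rather than instead of them.

```python
# Test letters and titles as listed in the text.
TESTS = {
    "A": "Mechanical information and comprehension",
    "B": "Mathematics",
    "C": "Spelling",
    "D": "Abstraction",
    "E": "Advanced arithmetic",
    "F": "English",
}


def tests_for(regular_navy=False, tradesman_exam=False):
    """Return the test letters a recruit would sit.

    regular_navy   -- hypothetical flag: recruit is entering the Regular Navy
    tradesman_exam -- hypothetical flag: recruit belongs to one of the
                      (unspecified) classes of tradesmen examined on E and F
    """
    letters = ["A", "B"]          # given to all conscripts and volunteers
    if regular_navy:
        letters += ["C", "D"]     # added for Regular Navy recruits
    if tradesman_exam:
        letters += ["E", "F"]     # assumed to be additional, not a replacement
    return letters


print(tests_for())                    # ['A', 'B']
print(tests_for(regular_navy=True))   # ['A', 'B', 'C', 'D']
```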


The Entry Establishment Scheme 
In 1941 the majority of H.O. recruits were sent to one of four
entry establishments to receive their elementary training in seamanship.
In each of these an industrial psychologist was placed to
carry out the second phase of selection procedure (cf. Rodger,
1946; Jennings, 1947). Later, almost all recruits went to a single
establishment, H.M.S. Royal Arthur. They spent a fortnight here
before beginning their training, in “kitting up”, and in medical
and psychological examinations. In addition to one resident and
two or more visiting psychologists, there were 16 P.S.O.s to deal
with some 1,200 men a week, and 26 Wren ratings who were
engaged mainly on test correcting and records. The resident
psychologist was either a civilian or a R.N.V.R. officer. Also working
with his department were a psychiatrist and a R.N. executive
officer, whose respective functions were to interview men referred 
by P.S.O.s as possible neurotics or as potential officers. There 
were six main features of the procedure in H.M.S. Royal Arthur, 
namely: 

1. Provision of Information.—Posters, photographs, and bulletins
describing the work, rates of pay, prospects and necessary qualifications,
were displayed for incoming recruits to study and discuss.
A preliminary talk by the psychologist or a P.S.O. both described 
the selection procedure, and drew attention to the various naval 
jobs, emphasising that it was up to every recruit to do the most 
difficult job of which he was capable, and not merely one which he 
thought would give him a pleasant time. Films and trade test pieces 
were also sometimes demonstrated. 

2. More Thorough Testing.—The standard naval battery of four
pencil-and-paper group tests was given:

Test 1. Modified Shipley Abstraction.

2. Modified Bennett Mechanical Comprehension.

3a. Arithmetic. 3b. Mathematics.

4. Squares Test of Spatial Judgment.

The sum of scores on these four (with Test 1 doubled) gave the
total score, known as T2, which was found to be the most useful
index of all-round potentiality in the Navy. A test of dictation or
spelling, and a mechanical + electrical information test, were
generally applied, and when time allowed various tests were given
experimental trials. No tests involving apparatus were feasible with
such large numbers. 
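
The T2 total described above is a simple weighted sum, which a short Python sketch makes concrete. The function and parameter names are illustrative only, and it is an assumption here that Test 3 enters as a single score combining parts 3a and 3b; the text does not say how the two parts were combined, nor what the score ranges were.

```python
def t2_total(test1, test2, test3, test4):
    """Weighted total of the four-test battery, with Test 1 counted twice.

    test1 -- Abstraction score
    test2 -- Mechanical Comprehension score
    test3 -- Arithmetic/Mathematics score (assumed already combined from 3a and 3b)
    test4 -- Squares Test score
    """
    return 2 * test1 + test2 + test3 + test4


# Made-up scores, purely to show the weighting.
print(t2_total(test1=10, test2=20, test3=15, test4=5))  # 60
```

Doubling Test 1 gives the abstraction score twice the weight of each of the other three tests in the total.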

3. Employment Interviewing.—Though each recruit had received
a tentative allocation at his recruiting centre, this could often be 
revised at the entry establishment in the light of the additional 
information collected, and of the Navy’s current requirements. He 
therefore received a 10 to 30 minute interview with a P.S.O., lead¬ 
ing up to a final employment recommendation. The interview was 
based on his original questionnaire, to which the results of the 
newly-taken tests had been added. Standardised oral tests of
fitting knowledge, machine tools, sheet metal work, petrol engines,
diesel engines, and electrics, were applied in suitable cases, and 
other questions designed to elicit trade experience were often based 
on trade test pieces (cf. p. 149). Individual performance tests—
Cube Construction or revised Kohs Blocks—were occasionally
given. P.S.O.s possessed up-to-date information on the quotas for 
different branches and on job requirements. It should be noted 
that the procedure was as much one of guidance as of selection, and 
that the eventual choice, reached after integrating all the evidence, 
was as far as possible one which the recruit himself approved. 
Each P.S.O. was expected to conduct approximately 80 interviews 
a week, in addition to doing a share of the group testing and
attending daily conferences on current work. But at rush periods
P.S.O.s sometimes interviewed more than 30 men a day. 

4. Matching Supply and Demand.—As indicated above, selection
for a single branch is of comparatively little value when the
total employable population is limited, since it almost inevitably
leads to denuding other, possibly more important, branches of
high-grade recruits. The psychologist must weigh up the demands
of all the branches, and distribute men of the best quality fairly
among them. Even those which require only a low average intellectual
or educational level must receive a sufficient proportion of
good material to yield potential N.C.O.s and instructors. Thus it
was necessary to resist the temptation to allocate all the most able 
men as mechanics, writers, telegraphists, etc., leaving only the 
average and inferior men as seamen, since many seamen would 
later have to specialise in gunnery, torpedos, Radar or Asdic, and 
from their ranks the leading seamen, petty officers, and most 
R.N.V.R. executive officers would eventually be drawn. 

Selection in the Forces was certainly more difficult than in
industry or education owing to the constantly changing demands.
Depending on operational needs, new categories would suddenly
be started about which little was known, perhaps for security
reasons. Other categories would close and large numbers of carefully
chosen men would have to be transferred. Any stiffening in
the training syllabuses for specialists might necessitate the raising
of standards. Expansion of a category might involve lowering
standards for a time in order to provide enough personnel.

5. Re-allocation of Transfers.—All men who became unfit for
the branch to which they belonged, or who failed in their training
courses, were returned to H.M.S. Royal Arthur for re-allocation.
In 1944 a scheme was instituted whereby the training schools had
to supply details as to the reasons for failure. This was particularly 
useful since, when the failure was attributable to mistaken judg¬ 
ment on the part of a P.S.O., she could investigate the case herself 
and profit thereby. The morale of these failed men was often very 
low, hence the employment interview served the valuable function 
of re-establishing their self-respect, as well as discovering a more 
suitable job to which they could be transferred. 

6. Training and Research.—Lastly, H.M.S. Royal Arthur served
as a training ground for newly-promoted P.S.O.s, for neuro¬ 
psychiatrists’ assistants, and other selection personnel. Moreover, 
it provided ample opportunities for applying additional tests or for 
other investigations, whereas previously psychologists had scarcely 
been able to obtain more than one hour for testing. 

Other Aspects of Naval Selection 

Collaboration with Neuropsychiatrists.—Though there were many
instances of fruitful collaboration between industrial psychologists
and medical psychologists or neuropsychiatrists, the relationships
of the two groups were less cordial than might have been anticipated.
Some psychiatrists tended to be suspicious of the ability
of psychologists to assess temperamental or personality qualities
by interviews. Psychologists, on the other hand, were doubtful of
psychiatrists’ knowledge of jobs, and did not wish to become too
closely associated, in the eyes of the Navy, with psychoanalysis and
abnormal psychology.

Although it was proved, in joint investigations, that recruits
picked by recruiting assistant Wrens as unstable were well worth
referring to psychiatrists (cf. p. 167), various practical difficulties
prevented such referrals from actually taking place. In H.M.S.
Royal Arthur, however, some 7 per cent. of men suspected of
neurotic tendencies by the P.S.O.s were seen by the resident
psychiatrist, and 0.7 per cent. were recommended by him for
discharge. He in his turn taught the P.S.O.s what to look for.

In their work at naval mental hospitals and clinics the neuropsychiatrists
were provided with clinical assistants who were
trained to apply portions of the Wechsler-Bellevue adult intelligence
scale, and the Trist-Misselbrook revision of Kohs Blocks,
to individual patients.

Work at Depots and Branches.—One or more P.S.O.s was
attached to each of the larger naval holding and drafting centres, 
where they assisted in the re-allocation of recruits returned from 
sea. Many men who had entered the Navy before the S.P. depart¬ 
ment was established were thus tested and interviewed. The choice 
of seamen for training in gunnery, torpedo, Radar or Asdic was 
made at the depots, and, though first priority was always given to 
men who had been recommended at sea, the P.S.O.s helped considerably
in raising and maintaining the quality of other trainees.
In practice, almost the whole of the upper half of the naval popu¬ 
lation (i.e. men with good education and work record, not merely 
good intelligence) were recommended for some form of specialist 
training. The same P.S.O.s were largely responsible for collecting 
from specialist schools the results of training courses, upon which
most of the follow-up investigations were based. 

Officer Selection.—Experimental application of tests and psychological
interviews in H.M.S. King Alfred—the training establishment
for R.N.V.R. cadets—gave promising results in 1942. After
study of the War Office Selection Board procedures (cf. Chapter
IV), a modified scheme was introduced in the following year which
spread out the selection process over a much longer period. Many
possible candidates were ear-marked by P.S.O.s in entry establishments,
and these were further interviewed by a psychologist and
a R.N. executive officer. Those picked out underwent their first
naval training in a new establishment, to which a psychologist and
a testing officer familiar with army methods were attached. Some
of the “leaderless group” and other army tests, adapted to naval
conditions, were applied, but the Board’s eventual decision as to
whether the men should go forward as “C.W. candidates” was
based mainly on observation of their behaviour during three
months’ strenuous training. Though there was no psychiatric
interview, the psychologist studied each man intensively and
applied certain projection and other personality tests. 

“New-type” Boards were later introduced for candidates from the 
Fleet, and for “Special Entry” cadets. The candidates usually spent
a week under observation, and were tested and interviewed. Thus
the Admiralty Selection Board received a large amount of information
on which to base its decisions. Work was also done in selection
for engineering commissions, and with officers needed for fighter-direction.
Some 10,000 candidates in all were covered during the war.
Although the standard tests, making up T2, differentiated well
at the officer level, several other more difficult tests were tried out,
the results with which are described in Chapter XIII.

Miscellaneous Fields of Selection.—Fleet Air Arm pilots and
observers were selected and trained by the R.A.F. But telegraphist
air-gunners mostly passed through H.M.S. Royal Arthur, and all
the various types of air fitters and air mechanics (electrical, airframes,
engine and ordnance) were handled by S.P. department,
constituting indeed one of its major—and most successfully accomplished—tasks
(cf. p. 121). Selection and re-allocation were also
applied on a large scale in the Royal Marines. The transference of 
large groups of surplus R.A.F. and army recruits into the Navy, 
and vice versa, raised serious problems of morale. Talks by psycho¬ 
logists and careful individual selection helped considerably to 
lessen the recruits’ sense of frustration. Some help was given, 
mainly since the close of the war, in allocating Special Service 
recruits and Continuous Service boys. Among boys being trained 
as electrical, engine room, ordnance or air artificers (who had been 
selected by educational examinations) it was shown that a battery 
of mechanical and intelligence tests was of value both in the initial 
acceptance and in the allocation to the different trades. 

Records and Headquarters' Activities.—During the war some
380,000 recruits passed through the selection procedure outlined 
above. All questionnaires were eventually housed in a central file, 
though a copy of the main items was first made and attached to
each recruit’s Service documents. Weekly returns of all test 
results and allocations were sent in. The filing system was so 
arranged that details on every recruit examined could be looked up 
at once, and it was usual for P.S.O.s in depots or other establishments
to telephone the names and official numbers of individuals
with whom they had to deal and to obtain the information forth¬ 
with. Two great advantages of this system were, first, that it was 
unnecessary to re-test recruits since first scores could always be 
traced, and, secondly, that it made possible extensive follow-up. 
The final course marks of, say, radio mechanics could readily be
compared with the selection data collected one-and-a-half years
earlier. Frequent returns from the specialist schools were also 
analysed at headquarters in order to provide a check on the quality 
of the material which was being sent to them, or on any changes 
in their requirements which might affect selection procedure. 

The other main activities of headquarters staff included:

(i) The preparation and dispatch of questionnaire forms, test 
material, information bulletins, posters, etc. 

(ii) Validation of selection procedures and other research. 

(iii) Preparation of job analyses, often with the aid of specialist
officers lent by the branches concerned. Information regarding
training syllabuses and job descriptions naturally involved
visits to training schools or to ships. But entry
requirements, pay, promotion, etc., were better investigated
at the Admiralty.

(iv) Liaison with other departments, such as the Directors of 
Naval Recruiting, Naval Training, Personal and Medical 
Services, and the Education Department. 

Psychological Studies of Training 

While selection was the Senior Psychologist's primary concern, 
his department was also authorised to undertake other psychological
investigations. One of the major fields was training, and
detailed enquiries were made into the selection and training of
Asdic operators, Radar operators and radio mechanics, telegraphist
air gunners, torpedo ratings and electrical mechanics and,
in the engineering branch, of artificer apprentices, mechanicians
and leading stokers. An elementary manual of psychological principles
of instruction was published (Vernon, 1942), and a critical
analysis of methods of learning to receive and send Morse code, 
based largely on recent American researches, was prepared. In 
these specialist courses the instructors and lecturers were carefully 
chosen and—at least towards the end of the war—many of them 
had been trained in instructional technique. Thus the psychologist’s
reports were less concerned with their competence than with:

(1) Success of trainees obtained from various sources, with
varying previous experience and background.

(2) Propaganda for attracting suitable candidates. 

(3) Organisation of the syllabus: working hours, distribution of 
periods, size of classes, etc. 

(4) Effects of conditions of work and training on morale. 

(5) Functions of theoretical instruction and its integration with
practical training. 

(6) Use of visual aids including diagrams, charts, films and
demonstration models.

(7) Methods of examining and marking.

Attempts were made also to consider how far the “training as 
given,” and examinations at the end of training, correlated with the 
“job as done,” since it appeared in some branches that undue 
stress was laid on knowledge or skills which would seldom be 
required at sea (cf. p. 108). Probably most of the psychologist’s
recommendations did not carry much weight at the time. Nevertheless
a large measure of goodwill was built up, and by the end of
the war there was a readiness to consult psychologists in the early 
stages of planning instruction or of producing training devices, 
which would have been unthinkable five years earlier. Though 
psychologists were members of Admiralty committees on training 
apparatus and training aids, there was little opportunity for controlled
experimentation. One large-scale investigation, however,
demonstrated the value of film and filmstrip in the elementary 
instruction of seamen (Vernon, 1946a). Little use was made by 
naval schools of objective examinations until, towards the end of 
the war, American practice and the advice of British psychologists 
began to have some influence. A simplified manual of “new-type”
examining was drawn up recently. 

Another field where there was scope for psychological advice 
was documentation. Even before the war the N.I.I.P. had been
consulted when the form for reporting on the efficiency of naval 
officers was undergoing revision. Recently a survey on behalf of 
the Director of Naval Recruiting showed how the paper work 
involved in the entry of recruits to the Navy (forms, records and 
returns) might be approximately halved. 

During job analysis and training investigations many instances 
of bad layout or design of equipment were noted. Although such 
matters have largely concerned psychologists in industry, there 
was no ready means of bringing the psychological viewpoint to the 
attention of the naval scientists and engineers responsible for the 
designs. However, the Royal Naval Personnel Research Committee
of the Medical Research Council did useful work in bringing 
together medical specialists and representatives of Admiralty 
departments dealing with training, clothing, gunnery, scientific 
research, naval construction, and others. Under its auspices investigations
were made not only into equipment, but also into working
and living conditions in submarines and the small craft of coastal
forces, in the Arctic and in the Tropics (Critchley, 1947); into
problems of night vision, colour blindness, seasickness, and other
physiological and psychological matters (Brit. Med. Bull., 1947).
Further, with the co-operation of the War Office time and motion 
study psychologists, a beginning was made into devising simplified 
gun drills, and the Admiralty now has a motion study unit of its 
own engaged on a wide variety of work-efficiency problems. 

Finally, a few investigations were carried out with opinion surveys
for gauging the attitudes of recruits towards various aspects
of their living and working conditions. One study dealt with
opinions regarding new schemes of interior decoration and lighting
of ships, others with views on demobilisation problems in which it
was found, incidentally, that the work of vocational psychologists 
was sufficiently appreciated for a large majority to favour the 
giving of psychological advice on post-war careers. 

Conclusion 

While the ratings and officer candidates who underwent selection
procedure were, in general, strongly in favour of its impartiality
and its careful consideration of the capacities and interests
of each individual, senior officers sometimes tended to be more
cautious. The layman is often frightened of people who are supposedly
expert in reading character or in explaining human
behaviour. He dislikes their inquisitiveness and their criticisms of
old-established ways of doing things, also their frequent resort to
statistical evidence. It was natural also that the use of women to
report on suitability for male occupations, and even to recommend
possible officer candidates, should come under fire. Gradually,
however, as the war progressed, and the value of the psychologists’,
P.S.O.s’ and recruiting assistants’ work became manifest,
these objections died down and many critics were converted. In 
1946-7 far more requests for assistance both in familiar and in 
fresh fields were received than the depleted staff could possibly 
cope with. 



CHAPTER III 


OTHER RANK SELECTION IN THE ARMY AND A.T.S. 

Abstract. Early experiments in the testing of Army and A.T.S.
recruits achieved only a limited success. In 1941 the Directorate
for Selection of Personnel (D.S.P.) was established, with powers to 
test, interview and make employment recommendations before 
recruits were assigned to their Arms of the Service. The staff of the 
directorate was almost wholly military, and certain difficulties 
arising from the restriction of the functions of qualified psychologists
are pointed out. The work of selection was carried out by
personnel selection officers and sergeant testers during the recruits’ 
primary training. This consideration of each individual’s abilities 
and interests at an early stage in his army career had excellent 
effects upon morale. The testing, interviewing and other procedures,
and the ear-marking of potential officers and tradesmen,
are described, together with the corresponding methods in the 
A.T.S. Much work was done in the re-allocation of converted 
units, of misfits, and other groups at home and overseas. A detailed 
analysis of Army and A.T.S. employments was made in 1941-2 
and kept up to date. The directorate was not concerned with applications
of psychology outside selection, but training, motion study
and other investigations conducted independently are briefly 
mentioned. 


The Army had even greater difficulties than the Royal Navy or
R.A.F. in meeting its skilled manpower requirements, since it 
attracted a smaller proportion of men of high intelligence and 
education, and, being unable to reject any except the medically
unfit at recruiting centres, it had to take in a considerable proportion
of very dull men. Recruits were posted on call-up by the
Ministry of Labour straight to some Corps or Arm, and there was
no ready means for subsequent re-allocation of the unsuitables. 
The complexity of modern war has greatly increased the need for 
specialists, and the more technical Arms in particular failed to get 
enough tradesmen or trainable men. Moreover, as the Beveridge
enquiry showed in 1941, many highly skilled men were engaged
on semi-skilled or unskilled jobs. The effects on instruction, and 
on the morale of misplaced soldiers, were extremely serious. It 
was noted, for example, that more neurotic breakdown occurred 
among recruits under training than in battle, and unsuitability of 
employment appeared to be one of the factors responsible (cf.
Sutherland and Fitzpatrick, 1946). 

In 1939 the Army Council agreed to have an intelligence test 
given experimentally in military training units, under the technical 
direction of a small group of civilian psychologists, headed by E. 
Farmer. They devised a 20-minute omnibus verbal and mechanical 
comprehension test known as F.H.3 (later F.H.R.), which for the 
most part had to be applied by regimental officers who were 
unskilled in testing. Nominal rolls of low scorers were drawn up 
to which the Army psychiatrists occasionally had access. But as 
there was still no effective provision for transfer of men to other 
Arms, or for the discharge of mentally defective or unstable 
recruits, very little was accomplished. 

An alternative scheme, put forward by A. Rodger early in 1941, 
with the help of G. R. Hargreaves, was the foundation of more 
fruitful developments. Hargreaves, an Army psychiatrist, had 
already obtained promising results with the Progressive Matrices 
test among R.A.M.C. personnel, and he influenced General Sir 
Ronald Adam, whose personal interest and energy, when he 
became Adjutant-General, played a major part in what followed. 
Eventually an advisory committee of psychologists was appointed,
consisting of Myers, Drever, Burt and Philpott. Their main recommendations
were that a new directorate for selection of personnel
be set up, that an intelligence test be given at recruiting centres on 
medical examination, and that more detailed psychological and 
psychiatric examination be carried out at mobilisation depots, 
before recruits were allocated to any Arm (cf. Myers, 1942-3). 
These proposals were implemented in July, 1941, though it was 
not until a year later that the third one, namely the General
Service scheme, could be put into effect. Psychological staff had
to be collected, personnel to carry out the programme had to be 
trained, a job analysis made of the main Army and A.T.S. employments,
and suitable tests and procedures devised, before effective
selection could be instituted (cf. Tuck, 1946; Directorate for 
Selection of Personnel, 1947).

The Staff of the Directorate

All executive functions were vested in the hands of Army
officers, partly because of the small numbers of trained psychologists
available, but mainly because it was thought that only a
scheme run by soldiers would be acceptable to soldiers. The
director, a brigadier, had to keep in intimate touch with the constantly
changing requirements of the Army. Under him was a large
staff of non-technical officers responsible for organisation, administration,
supplies, etc., and all the day-to-day application of
selection was carried out by personnel selection officers (mostly
regimental majors or captains), presidents (colonels) and military
testing officers at selection boards, and testing assistants (sergeants)
who were given a small amount of psychological training. Psychologists,
all but two of whom were put into uniform, were confined
to planning and inspection of selection procedure, training the 
non-technical staff and carrying out research and development 
investigations. At the peak period there were 19 psychologists 
(including 6 women) and 31 officers or promoted sergeants with 
some training (6 women), together with nearly 600 non-technical 
officers and 700 N.C.O.s (of whom about 60 and 200 respectively 
were women). In addition a few qualified personnel were attached 
to the Directorate of Biological Research, the Department of the 
Scientific Adviser to the Army Council, and the Director-General 
of Army Medical Services (cf. Expert Committee, 1947). 

Most of the trained psychologists were brought in from the 
universities; but the chief psychologists to the Army and A.T.S. 
and four others were past or present members of the N.I.I.P. staff. 
P.S.O.s, M.T.O.s and sergeant testers were volunteers from the
Army, chosen in the first place for good intelligence scores, 
relevant past experience, and apparently suitable personalities. 
Sergeants had two or three weeks’ training which dealt with the 
nature of the tests, their application and scoring. P.S.O.s received
a month’s training in principles of vocational selection, job
analysis, assessment of educational level and occupational experience,
Army requirements and documentation, the nature and use
of tests and interviewing technique. A stiff written and practical
examination was set before they were finally accepted, the former
covering knowledge of procedures. The latter consisted of interviewing
“stooges,” obtaining information from them and making
employment recommendations. Little or no instruction was given
in general psychology as such, though reading was encouraged.
Headquarters administrative staff and board presidents had no 
systematic training (the higher the rank and the greater the 
responsibility, the more difficult it became to demand technical 
qualifications). Nevertheless, many of them voluntarily undertook
considerable study of the psychological background of their work 
and of the techniques they had to apply. 

While the huge size of the task to be done in the Army necessitated
a large non-technical staff, it seems doubtful, judging by the
experience of the Navy, whether quite so much timidity about the
reactions of the Army to psychology was justified. Without
belittling in any way the very fine work done by all types and
grades of staff, it must be admitted that the organisation did not
always function smoothly. Psychologists suffered considerable
frustration. They were commonly of lower rank than their non-technical
colleagues. Policies which they advocated as scientifically
sound were often rejected, and the methods they devised were
often misapplied and misinterpreted by insufficiently trained
personnel. Their training had perhaps predisposed them to seek
what was best for the interests of the individual soldiers with
whom they had to deal, and they were less experienced in envisaging
the broader needs of the Army as a whole. Again, the fact
that they were immured in headquarters (apart from occasional
visits for inspection or for carrying out experimental investigations)
tended to widen the gulf between the technical and the practical
aspects of selection. The lot of the sergeant testers was particularly
deplorable. Many were highly intelligent teachers and university
graduates, but, except for a few brought to headquarters as
research assistants and statisticians, they were restricted to routine
application and scoring of tests under the command of P.S.O.s
whose educational and psychological qualifications were sometimes
inferior to their own, and were liable to be put on to cutting the
grass or other duties at the whim of any C.O. A quarter to a third of
them eventually achieved commissions by the same route as other
Army recruits, but not on grounds of technical competence at
their work. Presumably the lesson to be drawn is that psychologists
cannot expect a complex institution like the Army to accept
novel procedures merely on scientific grounds, that gradual education
and infiltration rather than the imposition of technically
valid methods are needed.

The General Service Scheme

The introduction of the Matrices test, given by sergeants at 
recruiting centres, allowed a better distribution of the available 
good-quality material among the Arms that needed it most, 
although no recruits were rejected by the test. When the General 
Service scheme came into operation, recruiting-centre testing was 
abandoned (except for regular volunteers). During the winter and 
spring of 1941-2 there was intensive work on an Army job analysis, 
the preparation of new tests and experimental trials of the full-
scale procedure. Under the new scheme all recruits were enlisted 
into the General Service Corps and sent to one of the numerous 
primary training centres (P.T.C.s) up and down the country for
six weeks, during which selection took place and a common 
syllabus of initial training was given. Some 160 P.S.O.s and 600 
sergeants dealt with the fortnightly intakes which normally 
included some 12,000 men, but occasionally rose to double this 
number. The following tests were given: 

Progressive Matrices. 

Test 2. Bennett Mechanical Comprehension.

Test 3A. Army Arithmetic and Mathematics.

Test 4. Squares test of Spatial Judgment (dropped in 1943).

Test 26. Verbal test (initially Test 17. Messages).

Test 16. Physical Agility. 

Additional confirmatory tests were applied to those considered 
suitable for certain types of employment, namely: 

Test 12 or 21. Clerical or Instructions (later introduced into 
the general battery in place of Squares). 

Test 8. Assembly test of Mechanical Ability.

Tests 10 and 19. Morse Aptitude and Auditory Acuity. 

Scores on each test were expressed as Selection Grades or S.G.s 
(cf. p. 177). After follow-up investigations had revealed the value 
of the Matrices, Bennett, Arithmetic, Verbal and Instructions 
tests in most Army jobs, the S.G.s on these tests were summed to
yield the measure of general intelligence known as Summed S.G. 
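
As a rough illustration of the summation just described, the following Python sketch adds one recruit's Selection Grades on the five tests named (Matrices, Bennett, Arithmetic, Verbal and Instructions). The dictionary keys, the function name `summed_sg` and the sample grades are hypothetical, invented only for this example; the numerical scale on which S.G.s were expressed is not assumed here.

```python
# The five tests whose Selection Grades (S.G.s) were summed, as named in the text.
SUMMED_TESTS = ("matrices", "bennett", "arithmetic", "verbal", "instructions")


def summed_sg(grades):
    """Return the Summed S.G.: the total of the S.G.s on the five tests.

    `grades` maps a test name to that recruit's Selection Grade.
    A KeyError is raised if any of the five tests is missing.
    """
    return sum(grades[test] for test in SUMMED_TESTS)


# Hypothetical recruit; the grade values are invented for illustration.
recruit = {
    "matrices": 2,
    "bennett": 3,
    "arithmetic": 2,
    "verbal": 1,
    "instructions": 3,
    "agility": 4,  # recorded, but not part of the sum
}
print(summed_sg(recruit))
```

Tests outside the five, such as Agility, are simply ignored by the sum, which matches the statement that only those tests found valuable in most Army jobs contributed to the measure.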

A qualification form (cf. p. 132), similar to the naval biographical 
questionnaire, was filled in under supervision of the tester. This 
was used as the basis for the interview of each man by the P.S.O., 
in which the details were clarified, the recruit's interests consulted,
and employment recommendations reached. Most of these recommendations
consisted only of broad categories, both because
sorting had to be carried out centrally, and because supply could
not otherwise be matched with demand for all the tremendous
variety of Army jobs. The following list of T.R.s (Training 
Recommendations) was used, after some revision. 

0. Potential tradesmen. 

1. Drivers. 

2. Mechanical maintenance (later omitted). 

3. Signallers. 

4. Constructional, building, and other heavy duties (originally 
layers and line operators). 

5a. Clerical and 5b. Storekeeping. 

6. Mobile and combatant duties, including riflemen and gun 
numbers. 

7. Orderly duties. 

8. and 9. Pioneer Corps or discharge. 

Three recommendations were given in order of suitability, 
though often a recruit received two identical T.R.s, e.g. 667, meaning
that both his first and second choices were for straightforward
combatant jobs. The T.R.s and other information such as age, 
medical category and Matrices (later Summed S.G.) results were 
forwarded to the War Office and entered on punched cards for 
sorting and posting. If the requisite number of, say, drivers for 
R.A.C., infantry and other Arms could not be obtained from men 
with T.R.1 first choices, second choices were drawn upon. The 
postings were then sent back to the P.T.C. within three weeks, and 
the P.S.O.s entered the final T.R., this time with a suggestion
as to a specific job within the T.R. (e.g. dispatch rider, tank 
driver, etc.). 
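
The sorting rule described in this paragraph (fill each requirement from first choices, then draw on second choices) can be sketched in Python as follows. This is only an illustrative assumption about the logic, not a reconstruction of the War Office punched-card procedure: the function name `allocate`, the recruit names and the demand figures are hypothetical, and the sketch ignores age, medical category and test results, which the text says were also entered on the cards.

```python
def allocate(recruits, demand):
    """Fill each T.R. quota from first choices, then second, then third.

    recruits: list of (name, choices) pairs, where choices is a tuple of
              three T.R. codes in order of suitability, e.g. ("1", "6", "7").
    demand:   dict mapping a T.R. code to the number of men required.
    Returns a dict mapping each posted recruit's name to a T.R. code.
    """
    remaining = dict(demand)  # copy, so the caller's figures are untouched
    postings = {}
    for rank in range(3):  # 0 = first choice, 1 = second, 2 = third
        for name, choices in recruits:
            if name in postings:
                continue  # already posted on an earlier pass
            tr = choices[rank]
            if remaining.get(tr, 0) > 0:
                postings[name] = tr
                remaining[tr] -= 1
    return postings


# Hypothetical intake: two drivers (T.R.1), one combatant (T.R.6) and
# one signaller (T.R.3) are required.
recruits = [
    ("A", ("1", "6", "7")),
    ("B", ("6", "6", "7")),
    ("C", ("6", "1", "7")),
    ("D", ("3", "1", "6")),
]
demand = {"1": 2, "6": 1, "3": 1}
postings = allocate(recruits, demand)
print(postings)
```

In this invented intake only one man has T.R.1 as his first choice, so the second driver vacancy is filled from a second choice (recruit C), as the paragraph describes.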

In making their recommendations P.S.O.s had, in addition to 
Army job analyses, lists of minimum standards on the various 
selection tests, medical categories and other qualifications regarded 
as essential for each T.R. Some perhaps applied this scheme too 
mechanically, while others were too apt to ignore it and trust their 
own subjective judgments. But the majority certainly attempted to
integrate all the relevant information about each man, including 
his test results as one important but not overwhelming factor. 
The P.S.O.s’ work and the sergeants’ testing and scoring were 
supervised by Command P.S.O.s who were psychologically qualified
officers. Moreover, records of the allocations were forwarded
to headquarters where the more glaring misplacements or disregard
of instructions were checked. The occasional P.S.O.s who were
clearly incompetent could thus be eliminated from the directorate.

Some 14 per cent. of recruits were referred to an Army psychiatrist
working at the P.T.C., including all those of very low
intelligence, men considered to be poor in “combatant temperament,”
and possible neurotics or psychopaths. Many of these
were passed as normal, but those with psychiatric disabilities
were either allocated to appropriate non-combatant employments,
or to the Pioneer Corps (armed or unarmed), or sent to mental
hospitals or discharged. By wise placement many emotionally
unstable men were helped to better adjustment than they showed
in civil life, and the Army’s use of mentally defective and very dull
men in the Pioneer Corps was an outstanding success. Recruits
who as civilians had been drifters or unemployables were found to
give excellent service at simple labouring tasks, when working and
living with others of a similar level of ability under specially chosen
officers and N.C.O.s. Their morale was high, and their sickness
and delinquency rates low. Moreover, the dull, hard jobs of the
Army were probably better done than they would have been by
more intelligent recruits (cf. Rees, 1946).

Potential tradesmen were also identified individually. P.S.O.s 
were provided with a Guide to Civilian Occupations which helped 
them to assess both the amount of trade experience claimed by 
recruits and its relevance to Army trades. Later, a series of written 
tests of trade knowledge was prepared, each covering a particular 
Army trade. Often the number of skilled or semi-qualified men
was insufficient, and the balance was made up from inexperienced
recruits of high intelligence who were considered to be rapidly
trainable. A “Short List” was issued each fortnight of the trades
for which candidates were specially needed. Actually, as shown later
(p. 120), these men put up by P.S.O.s did distinctly better at trade 
courses than the semi-qualified tradesmen submitted by the Ministry 
of Labour, or others put up at their own request, or by their C.O.s. 

P.S.O.s also had the task of assessing good or poor employment
record, very strong or weak combatancy or aggressiveness (some
6 per cent. of men were usually placed in each extreme category),
and suitability as a potential officer. Those in the latter group were
automatically sent to War Office selection boards after six months’
training, along with other candidates recommended from their
units, and, in fact, provided the major source of officer material
in the second half of the war. When strong experimental evidence
had accrued as to the value of certain of the standard tests in
selecting successful candidates, P.S.O.s were instructed to take
particular account of the combined score on these tests.* Many,
however, preferred to rely on their own hunches regarding the
officer-like qualities of their interviewees.

Between 1942-6 some 700,000 recruits passed through the G.S.
Corps of whom 9 per cent. were chosen as potential tradesmen and
6 per cent. marked as potential officers. Each man’s qualification
form, with test results and T.R.s entered, was attached to his
Service documents, but a copy of the main items was entered on a
card and stored in headquarters for research purposes. Unlike the
naval system, there was no easy means of disinterring the selection
information about any given individual, and when recruits came
up for later re-allocation, their original scores were seldom available.
Many men went through the same battery two or more times,
a proceeding whose bad effects were only partially overcome by
the provision of retest norms. Not until the end of the war was it
possible to prepare parallel versions of the chief tests. 

Although less stress was laid in the Army than in the Navy on 
self-guidance, P.S.O.s gave an introductory talk to each batch of
recruits and spent part of their interviews providing information 
on the types of employment likely to suit them. 

In conclusion, selection procedure on so large a scale had its 
inevitable weaknesses, but there is no doubt that the fair and 
thorough consideration given to each individual had a most favourable
effect on morale. Even apart from the saving in misplaced
manpower, of which evidence is given below, this feature provided 
justification for its introduction into the Army. 

* The combination was the score on Instructions + Bennett, minus the score on
Arithmetic + Matrices. This closely duplicated the weightings obtained from the
multiple regression equation (cf. p. 104) between the G.S. test battery and
W.O.S.B. Pass or Fail.

Initial Selection in the A.T.S.

Selection procedure in the A.T.S., which is fully described in
an article by Mercer (1946), differed little from that in the Army
and was run under the same directorate. By 1942, with the introduction
of compulsory national service for women, the numbers
in the Service reached over 200,000, and there were more than
a hundred different employments. Some of these could be filled
directly by recruits engaged in similar civilian jobs, but there were
many with no civilian counterpart, such as instrument numbers
(anti-aircraft personnel), wireless operators, tinsmiths and fitters.
Again many different types of work might be involved in jobs
called by the same name; for example, there were eighteen kinds of
clerk. One of the first steps, therefore, as in the Army, was a job
analysis of the main varieties of employment, special attention
being paid to the conditions under which the work would be done
(sedentary, outdoor, dirty, long hours, etc.), and the degree of
individual responsibility involved, as well as to the capacities and
training needed.

At the beginning of the war most of the allocation was done by
the commandants of the basic training centres who often had to
deal as best they could with, perhaps, 260 recruits in an afternoon.
Early in 1941 an experimental battery of sensory-motor and perceptual
tests, devised by the Cambridge Psychological Laboratory,
was instituted for the selection of A.A. instrument numbers. This
was taken over by the directorate, but later found to be less reliable
and valid than ordinary paper-and-pencil tests (cf. p. 248). The
Matrices test was given at recruiting centres by A.T.S. sergeant
testers from 1942 onwards, and between 6 per cent. and 26 per
cent. of the lowest scorers were rejected, depending on the number
and level of recruits required. It should be noted that there was
no temptation to fail this test in order to escape conscription. Most
recruits were volunteers, and all the others had chosen the A.T.S.
in preference to some alternative form of service. There was no
interviewing or allocation at recruiting centres. 

At the A.T.S. basic training centres, the procedure was very
similar to that in Army P.T.C.s, except that, with the smaller
numbers, recommendations for particular jobs could be made
instead of for general types of duty. Usually two or three suggestions
were listed, which enabled supply to be matched with
demand. P.S.O.s in the A.T.S. were chosen to be as tactful and
sympathetic in manner as possible, since many personal problems
were apt to be raised by recruits. Referrals to (women) psychiatrists
were of value, but were less frequent than in the Army, both
because of the exclusion of low Matrices scorers and because undue
lack of combatancy did not need to be catered for. A large amount
of information on jobs was made available to recruits by means of
photographs, films and talks. With the exception of the Agility
test, almost the same battery of tests was used, but a simple form
of computation-checking test was substituted for the Arithmetic
3A test (cf. p. 229), a spelling test was added, and standardised
stenography and typewriting tests were available for those
claiming clerical experience. Nearly 40,000 recruits passed
through this procedure of whom, it was subsequently found,
94 per cent. were successful in the training to which they
had been assigned.

Re-allocation and Conversions 

Probably as much work was done on re-allocation or transference 
of men and women to new jobs as on initial selection. During 
1942-3 some hundred battalions were converted from Infantry to 
other Arms. Later, when the danger of air-raids decreased, most of 
A.T.S. anti-aircraft employment was closed down. Casualties in 
the armies overseas, or medically down-graded men might need 
new employment. Many units formed before the institution of 
selection procedure contained numbers of misfits who, having 
failed at the first job to which they had been posted, were put on 
to orderly duties or had nothing to do. Recruits of long service in 
the A.T.S. (enlisted before 1942) could be recommended by their 
C.O.s for up-grading to a more skilled and better paid job. Finally, 
some new need might arise, particularly in overseas units for which 
suitable men had to be rapidly collected from other jobs, as when 
the Army of the Rhine urgently demanded deep-sea divers for 
work at Amsterdam Harbour, and the demand (which would
normally have taken weeks of correspondence with the units) was
instantly met by the D.S.P. documentation scheme.

Some of the more straightforward re-sorting could be done by
Command P.S.O.s or headquarters staff merely through scrutinising
the qualification forms of recruits who had been through
General Service. For large-scale schemes, personnel selection
teams visited the units and applied the usual testing and interviewing
procedure. Such teams were also set up in Egypt, North
Africa, Italy, etc., to deal with local problems. In 21st Army Group
records were collected before D-Day on Findex cards, to facilitate
selection for special duties and reorganisation or regrouping after
battle. Every N.C.O. and other rank, in addition to being tested,
was rated on a number of qualities pertaining to efficiency at a conference
held between a P.S.O. and officers or N.C.O.s who knew
the men best. The same technique was of value in disposing of the
36,000 A.T.S. in A.A. units.

Perhaps the most noteworthy re-allocation scheme, from the 
psychological angle, was that of the Army and A.T.S. selection 
centres, where misfits, down-graded recruits and the like were 
sent. All were re-tested, medically examined, and very thoroughly 
interviewed by P.S.O.s, and a third to a half were seen by psychia¬ 
trists. A conference was held between the C.O., the medical 
oificer, psychiatrist and P.S.O. at which final decisions were taken, 
about 30 recruits being covered in 2-2J hours. Those for whom no 
appropriate employment was available could be, and were, dis¬ 
charged. This system was particularly valuable in raising morale, 
and substantially similar methods were used for dealing with 
escaped and repatriated prisoners of war. In a single year (1943-4), 
30,000 to 40,000 men passed through A.S.C.s, and a similar
number were re-allocated overseas; 76,000 returned prisoners were 
dealt with during 1945.

Miscellaneous Psychological Activities 

The technical staff at headquarters engaged on routine statistics 
and records, development and research, was much larger than in 
the Royal Navy. Some 20 technical officers, P.S.O.s and
psychologists and 60 sergeant assistants were often involved. Test
construction and follow-up are described in later chapters. Job
analysis was continually being extended to cover the smaller
categories omitted in the first survey, and to allow for any
alterations in the nature of employments. The W.O.S.B.s and A.T.S.
had their own job analysis teams. The usual method was for an
officer specialising in an employment to collaborate with the
technical officer. The two visited units to study the recruits at
their training or on the job. Towards the end of the war, simplified
editions of analyses of all the jobs in certain Arms were issued
which aimed to convey to P.S.O.s, by means of diagrams, drawings
and photographs, as vivid a picture as possible of the work and
the conditions under which it was done. The minimum standards
or other qualifications issued for each employment also underwent
continuous revision. An additional P.S.O. was attached to each 
Arm in order to keep D.S.P. in constant touch with its selection 
problems, and to advise on any changes. 

One branch of D.S.P. was concerned solely with exceptional
recruits who required special employment recommendations.
Some of these were psychoneurotics; others were men of very high 
intelligence, but with no signs of officer quality or of mechanical 
aptitude—for example, university graduates in languages or 
classics, for whom posts as interpreters, script writers, radio 
announcers, etc., could often be found. Another activity was
dealing with “atrocity stories,” i.e., complaints by senior officers or
politicians of gross misplacements, or rejections of particular 
recruits. 

Much assistance was given to psychologists in the Allied Forces.
Thus, the Dutch, Belgian and French Armies set up similar
selection organisations and translated several of the S.P. tests.
Several D.S.P. teams were sent abroad to study and carry out
selection of non-English speaking recruits—Indians, Palestinians, 
East and West Africans, etc. 

Boy tradesmen are recruited for the regular Army, much as for 
the Navy, by academic examinations at about the age of 14, and 
receive some four years’ apprenticeship training. Here, too, the 
value of mechanical tests and of an interview by an industrial
psychologist was demonstrated (cf. p. 246).

The directorate was specifically established to deal with
selection. This had the advantage that it could prohibit units from
applying home-made tests. For example, one mechanic’s branch
was in the habit of throwing out numerous promising mechanics 
on the basis of an unstandardised general knowledge examination. 
But it also meant that there was no brief for carrying out other 
psychological work. Several researches were, in fact, undertaken
as being relevant to selection; for example, the effects of
menstruation on A.T.S. test performances, the effects of attendance at
physical development centres or at basic education courses, or of 
membership of pre-Service organisations (cf. Chapter XI). But 
training investigations in particular fell outside the directorate’s
scope. 

Two memoranda on methods of Army instruction were drawn
up by Burt and Valentine (1942, 1943) on the basis of replies to 
questionnaires sent to past students who were serving. The main 
defects mentioned were: 

(1) Mechanical, parrot-like teaching. 

(2) Unnecessary enumeration of parts of weapons at early stages.

(3) Use of technical vocabulary and unfamiliar words.

(4) Too much crowding of instruction; bad organisation of
courses.

(5) Lack of “learning by doing”; lack of appeal to interest.

(6) Lack of visual aids, and of elementary textbooks.

(7) Use of sarcasm or abuse by instructors; unsympathetic
handling.

(8) Inadequate selection of N.C.O.s suited to instruction; lack 
of training in instructional technique. 

(9) No homogeneous grouping of classes in ability. 

(10) Haphazard and subjective examination techniques. 

Later, W. Stephenson, who was appointed Consultant Psycho¬ 
logist to the Army Medical Services, carried out extensive time 
studies and other investigations of instruction, recording, for
example, the relative times devoted in specimen periods to
demonstration, to practice and to oral instruction, and the use of visual
aids. Several reforms arose out of such enquiries, and “methods
of instruction teams” were trained which toured the units and
Army schools. 

A time and motion study group, headed by Ungerson (1946), 
was established under the Scientific Adviser to the Army Council, 
which recorded existing drills for gun teams and drew up revised 
and more efficient methods. For example, the rate of fire of a 
coastal artillery gun was doubled, the time for unloading a jeep 
and trailer from a glider was reduced from ten to two minutes, and 
an improved scheme of handling ammunition in a tank was worked 
out. Other problems with psychological bearings which were 
tackled by the Medical Research Council, the Directorate of 
Biological Research, or the Directorate of Army Psychiatry are 
mentioned in the Expert Committee’s Report (1947). These 
included design and layout of equipment and of operations-rooms 
controls, effects of poor ventilation on motor co-ordination, effects 
of certain drugs, and morale and disciplinary studies. 



CHAPTER IV 

ARMY OFFICER SELECTION 

Abstract. The old-type interview boards for the selection of
officers showed considerable shortcomings by 1941, and new War
Office Selection Boards were set up, each staffed by a president,
a psychiatrist, military testing officers and a psychologist or
sergeant testers. These conducted much more thorough two- or
three-day examinations of candidates and, whether their methods
were entirely technically sound or not, they won the confidence
of the Army and stimulated a continuous flow of good material.
The functions of the various board members are discussed in some
detail, also certain conflicts which arose between the military and
psychological viewpoints. Psychologists were chiefly confined to
training, research, and the application of intelligence and
projection tests to candidates. The diagnosis of the character qualities
which are so important to an officer was mainly done by
psychiatric interview, and by the “leaderless group” and other practical
exercises. These were designed not so much to reveal officer traits 
(e.g. leadership) or abilities, as to bring out the candidates’ social 
reactions under conditions of strain. Some criticism is made of the 
scientific value of these diagnostic methods, but they nevertheless 
have important bearings on the selection of executives, civil 
servants, and other high-grade personnel in peace-time. 


In the first two years of the war, the selection of temporary 
officers was based on recommendations from C.O.s and quarter- to 
half-hour interviews of the candidates by boards of senior officers.
The system worked fairly effectively so long as there was a large 
supply of good material, e.g. from the public schools. But when 
this source began to dry up, the boards, being faced with recruits 
whose social and educational backgrounds were entirely unfamiliar, 
were unable to discriminate effectively. Unsuitable candidates 
were often passed and sent to O.C.T.U. (Officer Cadet Training 
Units), where large proportions failed, with unfortunate effects
on the morale of the remainder. Moreover, so many candidates
who might have succeeded were rejected by the boards, often
(according to their own accounts) on flimsy grounds such as
Grammar School education or socialist opinions, that recruits lost 
confidence in the system, and there was a real danger of insufficient 
officers being forthcoming. Again, there was less opportunity than 
in 1914-18 for selection on the basis of performance in battle. 

Early in 1941 experiments were carried out by two psychiatrists 
attached to Scottish Command, encouraged by the G.O.C., Sir 
Andrew Thorne, who had previously been military attaché at
Berlin and had observed some of the elaborate selection techniques
developed by German military psychologists (cf. Chapter I).
Officers attending courses at the Edinburgh Company Commanders
School were given psychiatric interviews and intelligence tests,
together with other tests on the German model. While the latter
gave unpromising results, the correspondence between the
psychiatric diagnoses and the school’s estimates of the officers’ worth was
very striking (cf. p. 160). By 1942 the first experimental “new-type”
War Office Selection Board had been set up. The methods worked
out here were adopted by other new boards, a dozen of which were
started in various parts of the country by October, 1942. A.T.S.
Officer Selection Boards followed in 1943. Later boards, on the
same lines, were attached to Armies in the Middle East, India,
Italy and Western Europe. By the end of the war some 140,000 
candidates had been through the new procedure, of whom about 
60,000 passed. 

So many popular descriptions have been published that a brief 
outline of the procedure will suffice (cf. Garforth, 1946; Directorate 
for Selection of Personnel, 1947). Most boards dealt with 120 or 
with 64 candidates a week. Though there were many variations 
between boards it was usual for candidates to visit for three days. 
During this time they filled in a biographical and a medical
questionnaire, and were given certain intelligence and personality tests,
the latter being of the projection type. A proportion were
interviewed by a psychiatrist, and all were interviewed by the president
or deputy president (senior Army officers). Each batch of eight to
ten was under the charge of a military testing officer (M.T.O.),
who messed with them, and did his best to put them at ease and to
observe their natural social behaviour. He too applied the series of
practical tests such as group discussions and lecturettes, command
situations, obstacle courses and “leaderless group” tests designed
to bring out the candidates’ initiative, co-operativeness, leadership
and other social qualities. Often a visiting officer, with the
president and other members of the board, observed some of these
tests. To candidates for technical Arms such as R.E. and R.E.M.E.,
a mathematics paper might be given and a further interview with
a specialist officer to enquire into their engineering or other
experience. At the end of the period a conference was held between all
members of the board where the judgments of M.T.O., psychiatrist,
technical officer if any, and president were compared and
discussed, and doubtful cases were considered in detail before the
president gave his final decision.

We can best study the method more fully under the headings:
the functions of the president, psychiatrist, psychologist and
M.T.O.*


The President 

At W.O.S.B.s the problem of the Army versus the technician 
was intensified. The new methods raised far more controversy and 
opposition than did any other aspect of personnel selection. As in 
the Navy, criticism came from regular officers rather than from 
candidates. It was clear that the latter considered the new boards 
a vast improvement on the old, for applications for commissions 
greatly increased in number. Time and again candidates—even 
failed ones—commented on the fairness, friendliness and thorough¬ 
ness of the scheme. They even resented having to forgo the 
psychiatric interview, though this was the feature that aroused 
most suspicion. Any candidate who did think himself unfairly 
treated was at liberty to apply again and to come before the same, 
or another, board, which took no account of his previous failure. 
We do not wish to imply that Army officers were biased. Their 
opposition was very natural, since it was they, as commanding 
officers, who in battle had to entrust the lives of their men to the 
junior officers accepted by the boards. Once they had seen the 
scheme working, the great majority were converted in its favour. 

* The present writer is indebted to members of the staff of the W.O.S.B.
Research and Training Centre for information, and to a paper on “The Growth
and Development of War Office Selection Boards,” read to the British
Psychological Society, Scottish Branch, Oct. 1946, by A. O. Heard. The views
expressed here are, however, his own.

For these reasons, it was thought that no scheme entirely run
by technicians could prove successful. As Sutherland and
FitzPatrick (1946) put it, this would have meant introducing “a foreign
body into the tissues of the Army.” Though the aim was to educate 
the Army gradually into accepting scientific methods, the com¬ 
promise eventually achieved showed considerable technical defects. 
Psychiatrists, psychologists and M.T.O.s were technical advisers
to the president; and each president could run his board as he 
wished, with as much or as little reference to the technicians as he 
wished, subject only to the controlling authority of the Director 
for Selection of Personnel, himself a professional soldier. Hence, 
the president, representing the Army, was responsible for the final 
decisions; hence also a major part was played by M.T.O.s who 
were regimental officers. This meant considerable dependence on 
the subjective judgment of a single man, and considerable diver¬ 
gence between different boards. The president (or his deputy) did 
have a detailed questionnaire on educational, employment and 
Army history, and the C.O.’s opinion, as a basis for his half-hour 
interview, but no standard scheme of interviewing was followed. 
Chiefly he attempted to gauge “officer quality” and “leadership” 
by enquiring into interests and past achievements, and the
candidate’s attitudes towards an officer’s rôles and responsibilities. In
addition, he tried to assess the relevance of the candidate’s quali¬ 
fications to his chosen Arm and could, if he thought fit, recommend 
his acceptance for some different Arm. Differences between boards 
greatly hampered all subsequent attempts to validate the pro¬ 
cedures (cf. pp. 122-127), and led to significant discrepancies in 
pass rates and distributions of grades, even though these were 
considerably smaller than those found among old-type boards.
Thus, in one experiment where strictly comparable groups of
candidates were sent to two boards, the respective pass rates were
23 per cent. and 48 per cent. This, however, was probably an
extreme case.

The Psychologist 

The chief technical officers consisted of five or six fully qualified 
psychiatrists and psychologists, some senior M.T.O.s, and several 
trained assistants. They were responsible for developing and 
improving the methods used by all the boards, but as their func¬ 
tions were advisory only, they had very little control. For the most 
part they were isolated in the research and training centre (R.T.C.), 
where they were not given any opportunity, until the end of the 
war, to carry out day-to-day selection themselves. They had little
contact, also, with the technical staff of D.S.P. working at the War
Office, since their approach, based mainly on psychopathology
and on psychological “field theory” (cf. p. 61), was so very
different from that of headquarters psychologists, which was based
more on industrial psychology and psychometry.

When the boards started, there was no possibility of finding 
fifteen or more psychologists qualified to undertake detailed diag¬ 
nostic testing and interviewing, though later many of the more 
promising sergeants were commissioned and given increased 
responsibilities. At first then, apart from the psychological staff of
R.T.C., board psychologists were sergeants, three of whom acted
as assistants to each psychiatrist; they did not attend the final
conferences. Another reason for the minor rôle of psychology was
that officer-suitability appeared to be chiefly a matter of character 
and personality, and in the absence of objective tests of the 
desired qualities, interview techniques which psychiatrists them¬ 
selves had evolved successfully at Edinburgh constituted the most 
promising approach. 

The sergeants’ functions were not, however, purely psycho¬ 
metric. They applied and scored intelligence tests, but also gave
the projection tests and to some extent interpreted the results of 
these, with a view both to separating off the candidates on whom 
the psychiatrist could most profitably concentrate from those 
whom he need not interview, and to feeding him with information 
or "pointers” about the former. 

Three twenty-minute group tests were used, namely, the Army
V.I.T. (Verbal Intelligence Test), an omnibus test of
conventional type, and new and harder versions of the Matrices and
Shipley Abstraction tests. Raw scores on these were converted into
equivalent scores, the sum of which was then converted into an
officer-intelligence rating (O.I.R.). This rating ranged from 10 to 0, with
3 corresponding approximately to the median for the general 
Army population, and 7 to the median for officers. In general, no 
candidates with O.I.R. below 3 were accepted. On the other hand 
very little stress was laid on a high rating; candidates scoring 10 
were not looked on any more favourably than those scoring, say, 6. 
Several other tests were available for special purposes such as the 
Wechsler, Mill Hill and other vocabulary tests, and the Trist- 
Misselbrook Kohs Blocks. Experiments were carried out with 
Weigl Sorting, Carl Hollow Square and other diagnostic tests,
particularly in the early days of the Edinburgh Board, though no
definitive results were obtained. An assembly test for engineering 
cadets was devised to yield qualitative ratings of mechanical 
interest and comprehension, methods of work, etc., rather than 
quantitative scores. A detailed information test on workshop pro¬ 
cedures was also prepared for R.E.M.E. cadets. 
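
The two-stage conversion described above (raw scores to equivalent scores, then the summed equivalent scores to an officer-intelligence rating) can be illustrated by a minimal sketch. The text does not give the actual conversion tables, so the norms and cut points below are purely hypothetical; they are chosen only so that a candidate at the general Army median receives a rating of 3, and the lower limit of 3 is applied as a floor without any extra credit for very high ratings.

```python
from bisect import bisect_right

# HYPOTHETICAL norms (mean, standard deviation) for the three group tests,
# taken relative to the general Army population. Illustrative values only;
# the real conversion tables are not given in the text.
NORMS = {
    "vit": (40.0, 10.0),
    "matrices": (30.0, 8.0),
    "abstraction": (20.0, 5.0),
}

# HYPOTHETICAL cut points on the summed equivalent score. A sum that
# exceeds or equals k of these cut points receives rating k (0 to 10).
CUTS = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5]

# Stated in the text: candidates below 3 were in general not accepted.
MIN_OIR = 3


def equivalent_score(test, raw):
    """Convert a raw score into an equivalent (standardised) score."""
    mean, sd = NORMS[test]
    return (raw - mean) / sd


def officer_intelligence_rating(raw_scores):
    """Sum the equivalent scores and convert the sum to a 0-10 rating."""
    total = sum(equivalent_score(t, r) for t, r in raw_scores.items())
    return bisect_right(CUTS, total)


def passes_intelligence_screen(oir):
    """Apply the rating as a floor only; a high rating earns no bonus."""
    return oir >= MIN_OIR


candidate = {"vit": 50, "matrices": 38, "abstraction": 30}
rating = officer_intelligence_rating(candidate)
accepted = passes_intelligence_screen(rating)
```

Treating the rating as a simple threshold reflects the statement that candidates scoring 10 were not looked on more favourably than those scoring, say, 6.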

In 1944-6 standardised educational achievement tests were con¬ 
structed for such groups as schoolboy candidates for university 
short courses, which were intended to supply potential officers for
Arms, and for candidates for Army College courses. 
These included objective examinations in arithmetic, algebra, 
geometry, calculus, heat, light, electricity, mechanics, general 
science, current affairs in history and in geography. The tests were 
of school certificate to higher certificate standard.

The first questionnaire has been mentioned above. Its primary 
aim was to indicate the candidate’s opportunities and what he had 
made of them. The second asked for various details of medical and 
family history, and was seen only by the psychologists and
psychiatrists. To some extent this acted like the personality inventories
in which neurotics tend to check large numbers of symptoms (cf.
p. 267). Candidates were also required to write two brief
descriptions of themselves as seen first by a good friend, secondly by a
severe critic. Sergeants were trained to extract from these
documents the points of major psychological interest. In the two group
projection tests, direct self-reference was intentionally avoided. A
fifty-word free association test was given with the title “Test of
Quickness of Imagination.” The stimulus words were presented
visually for fifteen seconds each, candidates being instructed to
write down the first idea that came to mind, either words or
sentences. Some of the words were originally chosen to have
ambiguous meanings, e.g. “lead,” it being hoped that the more
suitable candidates would react to their military meanings. But
this feature possessed no diagnostic worth. 

Lastly, six of Murray’s Thematic Apperception pictures were 
shown by slides for four minutes each. Candidates were instructed 
to write short stories describing the social situations suggested. 
Sergeants received training, both at R.T.C. and from their own 
psychiatrist, in the interpretation of responses to these projection 
tests, neither of which was objectively scored. On an average, 
half-an-hour was available for reading the questionnaires and
projection responses and integrating their deductions into a
summary of the salient personality features, according to a more or less
standard scheme. Note was made of the frequency of appearance
of such attitudes as anti-social, disruptive or cohesive, also of
undue morbidity, or of special types of symbolism. Degree of
self-insight, breadth of outlook, and extent of identification with the
Army’s needs were assessed. Admittedly the technique was largely
intuitive, and, although investigations into reliability and validity
were planned, it was not possible to complete them. Nevertheless,
a moderate degree of consistency was found between different
sergeants judging the same material. Moreover, when a large
number of psychiatrists and psychologists independently
interpreted the pointer material of 38 candidates, the promising
correlation (cf. p. 104) of 0.58 was obtained between their modal judgment
and the final board grading.
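
The comparison of a modal judgment with the final board grading can be sketched as follows. This is an illustration only: the text does not say which correlation coefficient was used, so the ordinary product-moment (Pearson) coefficient is assumed, and the ratings and gradings below are invented toy figures, not the data of the 38 candidates.

```python
from math import sqrt
from statistics import mean, mode


def pearson(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def modal_judgments(ratings_by_candidate):
    """Take the most frequent rating given to each candidate by the judges."""
    return [mode(ratings) for ratings in ratings_by_candidate]


# HYPOTHETICAL data: each inner list holds the independent ratings given
# to one candidate by three judges working from the pointer material.
judge_ratings = [[3, 3, 4], [2, 2, 2], [4, 5, 4], [1, 2, 1], [3, 3, 2]]

# HYPOTHETICAL final board gradings for the same five candidates.
board_grading = [3, 2, 5, 1, 2]

modal = modal_judgments(judge_ratings)
r = pearson(modal, board_grading)
```

Using the mode rather than the mean of the judges' ratings follows the text's reference to a "modal judgment"; with graded categories it picks out the verdict on which most judges agreed.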

With the provision of more commissioned psychologists towards 
the end of the war, it was arranged that candidates whose pointers 
were not followed up by psychiatrists, and whose personalities 
appeared less complex, should be interviewed by a psychologist, 
though his province was strictly limited to “surface” traits and 
sentiments. At the end of 1946 the shortage of staff became so acute 
that both psychiatrists and commissioned psychologists were with¬ 
drawn from the boards altogether. 

The Psychiatrist 

Since there was only one psychiatrist at most of the boards, and
since his interviews lasted from 20 to 60 minutes, the proportion
of candidates seen had to be limited. But an additional reason for
this was the manifest desire of Army authorities to reduce his rôle
to a minimum. While their suspicion was natural enough, it was
certainly misguided. Only those who were ignorant of board
procedures and of the nature of the discussions at the final conference
could continue to believe that psychiatrists spend all their time
smelling out sexual complexes. It is true that in occasional cases
of superficially well-adjusted personality the psychiatrist might
discover psychopathic or neurotic tendencies and recommend
their rejection. But at least as often he drew attention to underlying
potentialities and strengths in outwardly unimpressive or diffident
personalities which other members of the conference had been
inclined to reject. His services as a purely medical examiner were
also sometimes needed. But the main value of the psychiatric
approach, and the reason for its success in the early experiments 
may, perhaps, be expressed as follows. 

The boards were concerned to select candidates who would: 

(i) Pass their O.C.T.U. courses, and acquire the necessary 
technical proficiency. 

(ii) Stand up well to the stresses of battle, showing resource,
leadership, aggressiveness and caution when these qualities
were needed.

(iii) Look after his men well and gain their confidence.

(iv) Co-operate effectively with other officers and make a good 
impression on his seniors. 

Several other rôles might be mentioned, but these are among the
chief ones. With most of the candidates little direct evidence was 
available, particularly regarding (ii) and (iii), thus the board had 
to predict future behaviour, often in young and immature person¬ 
alities, from the indirect indications of past history. Now it is the 
medical psychologist—particularly the psychoanalyst—who has 
been chiefly responsible for introducing the viewpoint, generally 
accepted by contemporary psychologists, that our behaviour is 
never a random or chance reaction to the circumstances of the 
moment, but is always determined by or arises out of that organisa¬ 
tion of innate and acquired tendencies which we call the total 
personality. Further, that we are not conscious of the true nature 
of many of the most important of these tendencies. Thus a more 
complete explanation of our present characteristics and a more 
accurate prediction of future trends is likely to be obtained by 
exploring the underlying mechanisms, and this the medical 
psychologist is particularly well fitted by his training to undertake, 
though others, such as the normal psychologist or the literary 
artist, are often capable of equally penetrating diagnoses. Having 
studied abnormal and disintegrated personalities the psychiatrist 
is quick to recognise the relatively well-balanced and integrated 
personality, and one of the principles upon which W.O.S.B. 
psychiatrists worked was that it is the harmoniously organised 
man who is most capable of showing aggressive or cautious, tactful, 
deferential, firm or sympathetic characteristics whenever these 
may be needed, and is least likely to give way under stress. Finally,
it will be noted that most of the desirable officer qualities consist
of appropriate reactions to people; there is little of the job which
will not be carried out in a social context. Medical psychologists
believe, rightly or wrongly, that adult social reactions are based
largely on the attitudes to parents and other humans established 
in infancy and early childhood, hence the relevance of their 
enquiries into family history and childhood memories. Among 
A.T.S. officers the social side is if anything more important, since
they are not usually required to be technical specialists and will 
seldom be subjected to operational stress. Thus the personality 
organisation and attitudes to people desirable in a woman officer 
(which may be very different from those in a man) were the subject
of special study, largely by women psychiatrists. 

The Military Testing Officer

The M.T.O. was originally introduced as a “cover plan”; he
was to apply tests of a military nature which would impress the
Army while the real job of selection was undertaken by the
technicians. However, his functions were progressively clarified, and
their value increased, through the investigations at the research
and training centre. In the first experimental board he tried
out some of the practical situation tests used in Germany, and
improvised others. For example, he took his group of eight
candidates to a stretch of country and there posed a series of
tactical or other problems to candidates in turn. Each candidate
again might be put in charge of the rest so that he could display his
handling of them in military situations. Obviously, the success of
candidates at such tests is likely to depend as much on their
previous training and experience as on their personal resourcefulness
or other qualities. Group athletic games and elaborate obstacle
courses were also devised, with a view to bringing out
co-operativeness, aggressiveness and “guts,” quick judgment, endurance and
the like; but physical status and bodily skills probably entered too
largely. Lecturettes, given with or without preparation, and
informal discussions were supposed to show fluency, confidence
or social aplomb, together with acceptable or undesirable attitudes
towards the topics discussed, e.g. discipline. M.T.O.s were usually 
men with battle experience, whose age was fairly close to that of 
the majority of candidates. Hence their personal experience of the 
work for which candidates were being selected was of value. But 
they had no systematic instruction in the assessment of personality 
or avoidance of halo. Although many of the tasks appealed to the
candidates’ sense of fitness and fairness (and also to outside critics),
they were quite unstandardised and unscoreable, and their 
diagnostic worth was extremely dubious. 

In the view of the R.T.C. psychologists, the search for particular 
traits characteristic of officers, and for tests to measure these traits 
was likely to be a waste of time. Influenced by the field theory 
approach of Lewin, Moreno and other writers in America, they 
could not accept traits as predominantly constant qualities of an 
individual, existing independently of the context in which they are 
expressed. Successful officers do not all show the same traits; thus, 
it would seem to be the total configuration of traits in their person¬ 
alities rather than the individual traits which makes for success. It 
follows that a candidate should not be thought of as possessing a 
certain amount of leadership, which he can display both in test and 
in real life situations. His personality is an organised whole, a 
system of tensions or needs, which interacts dynamically with the 
varying demands of different situations. “Officer quality” should,
therefore, be analysed in terms of the main rôles that future officers
will be called on to play. By setting appropriate tasks a similar
system of forces can be set up at the W.O.S.B., whose interplay
can then be observed by the M.T.O. or other board member. The
candidate’s most important rôle will be that of leader of a small
group, and he should be able to uphold his own position in such
a group, to give the group direction, and at the same time maintain
its cohesion or solidarity against internal or external disruptive 
forces. 

It was from this theoretical background that the basic series of
leaderless group tests was evolved in 1943 by Bion; these tests
constituted the most original feature of W.O.S.B. procedure. In
a leaderless group test, a group of, say, eight candidates as a whole 
is given a task, and left to work out its own solution. No leader is 
appointed, and the M.T.O. is an entirely neutral observer. Each 
candidate is out to distinguish himself, but the task is so designed 
that this is only possible through the success of the group. In other 
words he is forced to submerge himself and serve the group’s 
interests. Each candidate resolves this conflict in his own
individual manner. Some become mere “passengers” who give up
their own desire to shine; others become “thrusters” who
subordinate group interests to their own. More effective solutions, lying
between these extremes, are shown by those who spontaneously
attract the confidence of their fellows, and who come to the
fore although working for the group’s success. Thus the reactions
of candidates to such situations are considered to be diagnostic of
their behaviour towards their platoons under Active Service
conditions. Bion (1946) writes: “The essence of the technique ... was
to provide a framework in which selecting officers, including a
psychiatrist, could observe a man’s capacity for maintaining
personal relationships in a situation of strain that tempted him
to disregard the interests of his fellows for the sake of
his own.”

The tasks were designed to be “real” ones which would bring 
out the candidates’ natural behaviour, not artificial test situations. 
To the candidates or the uninitiated observer they appeared to
pose a practical problem, but this, of course, concealed the underlying
social problem in which the selection staff were interested.
The basic series of tests, although run by the M.T.O., came to
provide a common meeting place for all members of the board.
They found that they could more effectively apply their particular
techniques after this preliminary observation of the candidates’
personalities “in action.” The tests provided an approach complementary
to that of the interviewers. The psychiatrist, for example,
could link up the abnormal adaptations of candidates to one
another with the adaptations to family, school and occupational
environment traced out in his life-history interview. The series
included such tasks as moving a heavy object over a series of
obstacles, or erecting some apparatus, or having a discussion or
debate on some topic, or planning an administrative scheme, say
a system of training for future recruits. The situations were so
arranged as to display the formation of initial social contacts
between members, their behaviour when co-operating on a common
task, their reactions to internal disruptive forces (competition
between members) and to external forces (competition of one
group with another).

Later, other tasks designed to reproduce such officer rôles as
independent command, as instructor, or administrator, and as
subordinate to senior officers, were posed by the M.T.O. In
selecting specialised personnel such as applicants for regular
Army commissions, colonial administrators, university short
course candidates, and the like, other appropriate “work-sample”
situations were devised. Sometimes a task extended over three
hours, the group having to study certain material, plan its course
of action, and report.

In the early boards, each member studied the candidates independently
until the final conference. This led to a somewhat
unnatural atmosphere, since members felt constrained not to
discuss the current group even informally, and serious disagreements
sometimes occurred at the conference, when it was too late
to make any further investigation of the doubtful candidates. Thus
the introduction of collaboration and mutual consultation at all
stages was found to have considerable advantages. Often most of
the members would recognise certain candidates as clear passes or
fails quite early on, and so feel free to concentrate their study on
the borderline or controversial cases. The system had, however,
certain drawbacks from the scientific angle, since it became impossible
to investigate the reliability or validity of the contributions of
the various parts of the board procedure.

Miscellaneous W.O.S.B. Activities 

Another function primarily carried out by M.T.O.s was the 
important one of preparing job analyses, covering the life and 
duties of each of the main types of officers to be selected. Later 
in the war, R.T.C. and the boards undertook numerous more 
specialised jobs including: 

(1) Selection of adolescents for eventual training as officers after 
periods of study at universities or technical colleges. 

(2) Diversion of promising candidates who appeared too immature
for commissions to special courses of training designed
to develop “leadership.”

(3) Selection of experienced personnel for special duties
(Palestine Mobile Police Force, Civil Affairs officers for
administration in the Far East, etc.); also classification of
men engaged in the psychological warfare branch.

(4) Guidance of certain classes of officers into suitable employment,
including psychiatric disability cases, officers who
were surplus in their old jobs, officers who had received
adverse reports, and, in particular, repatriated prisoners
of war. The latter were dealt with in Officer Resettlement
Units, where tests were dispensed with and everything was
planned to reassure the men and to readjust them to normal
life. Similar principles were extended to the handling of
other rank ex-prisoners in Civil Resettlement Units, and to
all demobilised soldiers who felt the need of help in their
readjustment. Accounts of this extensive experiment in
psychotherapy and its successful results have been given by
Wilson (1946) and Curie and Trist (1947).

(5) Selection of war-time temporary officers for permanent
commissions in the regular Army, and the development of 
a technique for selecting temporary officers in the peace-time 
conscript Army. 

(6) Assistance was given also in the selection of Royal Marine 
Corps officers, paratroop officers, and in an experiment in 
the organisation and methods division of the Treasury. 

Since the war many of the R.T.C. staff have joined together in 
the Tavistock Institute of Human Relations, and are applying the 
principles and techniques evolved in the Boards to the study of 
industrial, educational and other civilian problems (cf. Jaques
et al., 1947).

Civil Service Selection Boards (C.S.S.B.s) have also been set 
up to advise the Civil Service Commission on the choice of candidates
for the administrative class and foreign service. But these
differ markedly from W.O.S.B.s. Their chairman and observers
have somewhat analogous functions to those of the president and 
M.T.O.s, but psychiatrists are not employed and the psychological 
staff consist of ex-industrial psychologists. Group discussions, 
planning problems and the like are applied, though with the object 
of throwing light more on the quality or calibre of the candidates’ 
intellectual powers than on their social adjustments. 

Discussion 

The reader may conclude that W.O.S.B. methods were essentially
similar to those of German military psychologists, which we
condemned in Chapter I. But in Britain no use was made of 
elaborate reaction time and other tests alleged to reveal a whole 
series of faculties or powers of the mind and character. Moreover, 
much more regard was paid here to the need for validation and 
follow up, even though the results (summarised in Chapter VII) 
were not very gratifying owing to the extremely difficult conditions 
under which they were conducted. It is the M.T.O. side of 
W.O.S.B. procedure which most closely resembled the German 
counterpart, and which is probably most open to criticism. None
of the leaderless groups or other M.T.O. techniques were standardised
tests; no measurements were taken, and only to a limited
extent were observations recorded under any standard scheme.
Although these techniques aimed to get away from what Gestalt
psychologists such as Brown (1936) call “Class Theory”, i.e. the
analysis of personality into a number of self-sustaining traits, it is
likely that most of the psychologically untrained observers continued
to treat them as tests of leadership, co-operativeness, and the
like. Again, field theory is as yet largely a priori and empirically
unverified even though it is not, like German theories of character,
unverifiable. There is no evidence that the consistency of
“reactions to group tensions” is any greater than the consistency
of conventional personality traits. We know, for example, that a
candidate who displays “leadership” in an artificial test situation
will not always show the same trait in real life, but it is doubtful 
whether his performance in leaderless group tests will be any more 
prognostic. Indeed, such tests are open to serious criticism from 
field theory itself. Every group of candidates constitutes a unique 
combination or Gestalt, hence any one candidate’s behaviour is 
likely to vary considerably with the make-up of the group. Though 
faced with objectively the same task, his reactions would probably 
differ if he was tested along with a different set of candidates. This 
objection was, however, partially met by the common practice of 
re-grouping candidates in later series of M.T.O. tests. As there 
were always several sets running simultaneously at a W.O.S.B., 
the best one or two from each set might be put together, or all the 
borderline candidates, and so on. 

Another fundamental difficulty is that the intense desire of candidates
to pass leads them to “put up a show” which may not be
typical of their everyday behaviour. All sorts of distorted accounts
of what the boards were looking for used to circulate among
potential candidates, and they would naturally tend to mould
themselves accordingly during the two or three days’ examination.
It is known, for example, that many arrived with ready-made
stories to fit the thematic apperception pictures, based on accounts
gleaned from previous candidates.

In justification for some of these weaknesses, it should be 
pointed out that the W.O.S.B. system was not developed, and 
should not be regarded, purely as a diagnostic technique. Under 
the circumstances of the war, the fact that it won wide acceptance
from the Army and stimulated the recruitment of candidates (in
other words its psychotherapeutic effect) was at least as important.
It was extremely valuable in that the Army was led to believe that 
it was thereby getting the best possible officers, even though its 
methodological shortcomings were such that it probably made 
numerous incorrect choices. But other selection techniques too are 
far from perfect, and, like other techniques, it was certainly an 
improvement on older methods. 

Since the war there has been a growing tendency for the selection
of managerial, professional and other high-grade workers to
be modelled, to some extent, on the W.O.S.B. plan (cf. Fraser,
1947; Hoovers, 1948). One to three day meetings are held instead
of half-hour interviews; psychological tests, group discussions and
exercises are introduced, and a qualified psychologist usually
assists the employers’ selection committee. This has several
advantages: 

(1) It provides a much larger sample of the candidates’ behaviour
on which to base judgments.

(2) It enables the selectors to observe the social interplay of
candidates collaborating or competing with one another.

(3) It appears manifestly fair to candidates and to employers,
and so stimulates confidence in the system, whether or not
this system is entirely sound scientifically.

But a good deal of caution is needed in adopting the W.O.S.B. 
methods since, as shown later, their value has not been sufficiently
thoroughly demonstrated. It should not be supposed that the
introduction of a few situational tests, apparently analogous to
situations involved in the job, provides the key to accurate assessment
of personality and job suitability, nor that the fallibility of
human judgment and the need for scientific validation are lessened 
thereby. 



CHAPTER V 

SELECTION IN THE ROYAL AIR FORCE 

Although aviation psychology had been started in most 
Air Forces during the 1914-18 war, research did not continue after 
the Armistice, with the result that in 1939 the R.A.F. was almost 
entirely dependent on the unstandardised interview in its chief 
selection situations. 

In 1940 a few psychometric tests were introduced under the
auspices of the Air Ministry Medical Branch to aid the Aviation
Candidate Selection Boards. This testing procedure was under
the technical supervision of the Cambridge University Psychology
Department where the first groups of W.A.A.F. testers (clerks
personnel selection) received their training.

In 1941 Ground Trade Selection tests (including G.V.K.) were
devised and introduced by Stephenson (Oxford University) who for 
two years acted as personnel adviser to the Central Trade Test Board. 

Finally, in 1943, the technical responsibility for all selection test 
procedures became centralised at Air Ministry, where Bott
(Toronto University) had founded a training research branch late 
in 1941. The terms of reference of this branch (now known as 
Science 4) made it responsible for advice on all psychological 
matters arising in the Air Force outside the realms of physiology 
and psychiatry. 

The early energies of Training Research were very largely devoted 
to the reduction of wastage in air crew (especially pilot) training. 
In 1942 grading (standard flight testing) was introduced; it resulted
almost immediately in a 60 per cent. drop in pilot training wastage.
Two years later a large battery of tests was assembled with the
purpose of obtaining aptitude indices for each category (pilot,
navigator, etc.) in respect of all accepted air crew candidates.
Reduction in training programmes since 1944 has made validation
difficult, but large samples have been followed up wherever
possible. 

In recent years research has been made into the personality (as
distinct from the skill) requirements of air crew. Technical effort
has also been expended in the selection of the officer and apprentice
populations, while at the moment an amplified ground trade
test battery is in preparation.

In addition a good deal of work has gone into the problem of 
training assessment (both ground and air), while two other highly
important fields (instructional technique and morale survey) claim 
most of the residual energies of an all too small branch. 

The seeds of aviation psychology were sown during the first World
War, Italy (Gemelli) and the United States (Henmon and Thorndike)
making the most substantial contributions. But unfortunately
psychologists were nowhere given a chance to follow up their
openings in the years after 1918, with the result that the second
conflict found the majority of Air Forces unprepared for sudden 
large-scale expansion. Thus, in 1939, the R.A.F. was relying 
almost everywhere on traditional methods to select and place its 
men. The claims of air crew applicants were vetted by a series of 
Service boards who so far as interviewing was concerned were 
unselected, untrained and supplied with no technical aids. Similarly
the selection of officers was entrusted to boards with little to
guide them beyond the knowledge that all the candidates they saw
came with some sort of recommendation from a commanding
officer. Ground trade applicants who claimed relevant civilian
experience were tested in respect of these claims by expert tradesmen;
but these same tradesmen were required, with no special
training, to undertake the much more general problem of assessing 
the potentials of the great majority of recruits who made no claim 
to any special skill. Such very briefly was the picture of selection in 
the first autumn of the war, and it has to be admitted that some of 
the features indicated were slow to change appreciably even after 
the notion of selection as a science had taken fairly firm root.

The introduction of systematic selection in the Air Force was 
more gradual and less unified than in either the Army or Navy.
Three distinct phases are to be noted: 

(i) 1940-2. Air crew selection under the technical guidance of 
F. C. Bartlett (Cambridge University), introduced and 
sponsored by the Medical Branch of the Air Ministry. 

(ii) Ground trade selection initiated and developed by W. 
Stephenson (Oxford University) acting as personnel adviser 
to the Central Trade Test Board, West Drayton.

(iii) December, 1941. Arrival at the Air Ministry of E. A. Bott
(Toronto University) to found Training Methods (later 
Training Research and now Science 4). From this point all 
work on selection and training research gradually became 
centralised under a single authority. 

The early work can most easily be discussed by considering 
each of these phases in turn. 

The “Bartlett Tests.” Early in 1940 the Cambridge Psychological
Laboratory was invited to produce a general intelligence
test to be given at the Aviation Candidate Selection Boards. The
time allowance (16 minutes) was hardly liberal, but the introduction
of any standard test, however brief, must be accounted a step
forward. Three parallel forms of a 20-item verbal intelligence test
(G.I.T.) were prepared and made available to all A.C.S.B.s in the
middle of 1940. The tests were at first administered by orderlies,
but it quickly became obvious that specially trained staff were
needed, and a trade (clerks personnel selection) was created to
meet this requirement. The results of the tests, expressed in
5-letter grades based on a 1-2-4-2-1 division of the scores of
applicant populations, were available for the guidance of board
presidents, but no cut-off levels were laid down, the presidents
being free to use their discretion in interpretation. General advice
was, however, given on the scores below which admission to (a)
pilot and navigator categories, (b) wireless operator and air gunner
categories was deemed problematical.
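
As an illustration of the 1-2-4-2-1 division mentioned above, the following is a minimal sketch, not the procedure actually used by the boards. It assumes that the five grades take 10, 20, 40, 20 and 10 per cent. of a reference applicant sample, counted from the top; the letters A to E and the function names are hypothetical, since the text does not name them.

```python
# Shares of the applicant population in each grade, top to bottom (1-2-4-2-1).
SHARES = (1, 2, 4, 2, 1)
LETTERS = "ABCDE"  # hypothetical labels; the source does not name the letters


def grade_cutoffs(reference_scores):
    """Return the lowest score earning each of the top four grades.

    The cut-offs are read off a reference sample ranked from best to worst.
    """
    ranked = sorted(reference_scores, reverse=True)
    n = len(ranked)
    total = sum(SHARES)
    cutoffs = []
    cumulative = 0
    for share in SHARES[:-1]:
        cumulative += share
        # Position of the last candidate inside this cumulative share.
        last = max(1, round(n * cumulative / total)) - 1
        cutoffs.append(ranked[last])
    return cutoffs


def letter_grade(score, cutoffs):
    """Convert a raw score to a letter grade using precomputed cut-offs."""
    for letter, cutoff in zip(LETTERS, cutoffs):
        if score >= cutoff:
            return letter
    return LETTERS[-1]
```

With a reference sample of twenty distinct scores the sketch places 2, 4, 8, 4 and 2 candidates in the five grades. Tied scores at a boundary would all fall in the higher grade, a simplification that a real scheme would need to settle explicitly.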

At the same time as the G.I.T. tests, a 16-minute elementary 
mathematical test was introduced and used as a very approximate 
guide to educational attainment. Both the G.I.T. and E.M.T. had 
the merit of being objectively scored, the scores being evaluated 
against standard grades. A third written test (general knowledge)
consisted of two fifty-word essays, the purpose being to afford some
clue to powers of expression and alertness to current affairs. Only 
a very slight reliance on this test was, however, encouraged, the 
risk of subjectivity in marking being considerable. 

The three tests just described remained part of the standard air
crew selection procedure for more than three years. They may not
have made a large contribution to the training wastage problem,
but they undoubtedly helped to make the Service test-conscious,
and in doing this paved the way for the much fuller testing programmes
of later years. In the beginning of 1942 the battery was
augmented by the S.M.A.3 (pilot co-ordination) test (cf. p. 248),
and a sub-battery of three tests (directions, tapping and morse
aptitude) aimed at the elimination of unsuitable wireless operator
trainees. S.M.A.3 was the work of G. O. Williams (Central Medical
Establishment, R.A.F.), while the wireless battery was assembled
by G. C. Drew (Cambridge University), who later came to do full
time work at the Air Ministry. The latter tests were introduced
executively following an experiment at Blackpool which led to a
sharp reduction in wastage in the basic signals course.

The “G.V.K.” Tests. In the early part of 1941 Stephenson
came from Oxford to take up full-time duties as personnel adviser
(ground) to the Central Trade Test Board, with whom he remained
for more than two years. Very little experience of trade allocation
problems convinced him of the need for a test whose first function
would be to sort the large and diverse weekly intakes into some half
dozen broad ability zones. For this purpose categorisation on the
most general lines was required, the question of group and specific
abilities not being of paramount interest at this stage. Because,
however, sharply contrasted trades, e.g. clerical and mechanical,
were liable to make demands for personnel at each ability level
there was a clear advantage in designing a two-purpose intelligence
test which would give pointers in respect both of general ability
and of strong leanings towards practical or theoretical work. The
three-part test known as G.V.K. is well designed to do both these
things. A reliable estimate of a recruit’s general level can be
obtained by averaging his three scores. But as the “V”
section has a considerable language (as well as verbal intelligence)
content it serves as a rough guide to educational attainment and
suitability for sedentary occupation, while “K” directs attention
to the possession or non-possession of practical-mindedness. A
simple arithmetic test was added to this short battery which in all
occupied rather less than an hour of testing time.
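
The double function of the G.V.K. test can be pictured with a small sketch. It assumes that the three part scores are already on a common scale, and the margin used to call a leaning “strong” is purely hypothetical; the text gives no such figure and no scoring rule beyond the averaging of the three scores.

```python
def gvk_profile(g, v, k, margin=10):
    """Summarise a recruit's three G.V.K. part scores.

    Returns the general level (mean of the three scores) and a rough
    leaning: "theoretical" when V exceeds K by at least `margin`,
    "practical" when K exceeds V by at least `margin`, otherwise "none".
    The margin is an illustrative assumption, not a documented value.
    """
    general = (g + v + k) / 3
    if v - k >= margin:
        leaning = "theoretical"
    elif k - v >= margin:
        leaning = "practical"
    else:
        leaning = "none"
    return general, leaning
```

Two recruits with the same general level may thus be pointed towards quite different trades, which is the economy the text attributes to the test.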

To understand how the above tests were used it should be
explained that during the war years the ground trade problem was
one of trade recommendation, not of overall acceptance or rejection.
In this it differed sharply from the air crew situation where
initial rejection rates rarely sank below 60 per cent. But the R.A.F.
ground population was in effect no different from the population
applying for entry and the divergences of both from the civilian
population were probably small (cf. p. 198).

Recommendations to ground trades were made by interviewers
chosen primarily for their experience in certain technical trades.
These were the people whose task the test battery was intended to
assist. For such aid to be effective, guidance was given on the mean
test scores found among trained personnel in each R.A.F. trade,
and on a “failure” score for each trade, i.e. a score below which
success in training had been demonstrated to be small. (Samples
of this are given in a later Chapter.) Scores were presented in
percentile form, the basic population being a random sample of
6,000 drawn from early 1942 intakes. The norms established at
this date have been held constant ever since, but revision in terms
of current N.S.A. and volunteer intakes is intended.
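
The use of fixed percentile norms with a trade mean and a “failure” score, as described above, might be sketched as follows. This is an illustration under stated assumptions only: the mid-rank treatment of ties, the function names and the wording of the advice are all hypothetical, and the trade figures would come from the tables referred to in the text.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(raw, reference_sorted):
    """Percentile of a raw score against a fixed, sorted reference sample.

    Tied reference scores count as half below (mid-rank convention,
    an assumption made for this sketch).
    """
    below = bisect_left(reference_sorted, raw)
    ties = bisect_right(reference_sorted, raw) - below
    return 100.0 * (below + 0.5 * ties) / len(reference_sorted)


def trade_advice(percentile, trade_mean, failure_score):
    """Place a recruit's percentile relative to one trade's guide figures."""
    if percentile < failure_score:
        return "below failure score"
    if percentile >= trade_mean:
        return "at or above trade mean"
    return "between failure score and trade mean"
```

Because the reference sample is held constant, a given raw score always maps to the same percentile, which is what holding the norms fixed amounts to; revising the norms means swapping in a new reference sample.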

Specialised tests were devised only in respect of one or two 
priority trades, e.g. radio wireless mechanics. These were given 
at the interviewer’s request to recruits found potentially suitable 
for such trades. It will doubtless be thought that a battery of four 
short general tests and two or three special ones is a very slight 
instrument with which to attempt allocation to more than a hundred
occupations. Unquestionably an ampler battery would
be desirable and it is hoped to introduce one in the comparatively
near future. Meantime one can note the ingenuity
and economy with which the G.V.K. test performs its double 
function. 

Training Research. Both air crew and ground trade selection
were thus already under way by December, 1941, when Bott, with
C. R. Myers as his assistant, took up his Air Ministry post as adviser 
to the Air Member for Training. For twenty months an embryonic 
branch was concerned almost exclusively with air crew problems, 
particularly with that of drastically reducing pilot training wastage. 
Its terms of reference were, however, exceedingly wide; so wide,
in fact, as to cover almost any application of psychology anywhere
in the Air Force. It was thus possible when Stephenson relinquished
his post in July, 1943, for ground trade selection to be
brought under the same head as the selection of air crew, while the
fact that the advisers had the right of entry to both selection and
training problems made possible a longitudinal survey of the airman’s
whole career. That the study of selection and training
has proceeded unevenly, the bulk of enquiry in the early years 
having been directed to the former, must not be ascribed to any 
maldistribution of effort. It has resulted simply from the fact 
that it is comparatively wasteful to undertake detailed research
into training methods until something has been done to ensure 
that approximately the right people are entering training. 

Between 1942 and 1946 Training Research (the name was altered 
from Training Methods early in 1943) expanded from a section 
with three or four research workers to a branch with a provisional 
scientific establishment of seventeen. During most of this period
technical staff were recruited by a series of special appointments,
but in the autumn of 1944 it was decided to bring these temporary
appointments within the framework of the Scientific Civil Service.
During a large part of 1946 nearly all the established posts were
filled, though lack of civilian psychologists led to a number being
taken by serving officers with psychological qualifications. Since
the end of the war, however, the strength of the branch has
declined by over 80 per cent., the chief reasons being demobilisation
and delays in deciding the future status of psychologists in
Government employment. At the moment of writing (December, 
1947) the small staff remaining is being forced to spread its energies 
among a greater number of problems than it can possibly do full 
justice to, while other problems, some of great urgency, cannot be 
tackled at all. 

From July, 1943, when A. J. Marshall took over Stephenson’s 
work on ground trade selection, the branch was divided into two 
sections, the first dealing with all air crew matters (whether selection 
or training) and the second with ground trade problems. Later 
re-organisation brought all selection problems (air and ground) 
together, a second section being set up to deal with training 
problems. 

The functions of the branch have throughout been advisory. In 
the early days responsibility was to the Air Member for Training 
alone; later, as the emphasis shifted back to the earliest stages of 
selection and as the desirability of a centralised selection system 
came to be appreciated throughout the Air Ministry, advice was 
increasingly sought by the Air Member for Personnel also*. 

* In 1947 a Deputy Directorate of Selection was formally established to control
the policy and administration of all selection procedures throughout the
Service. This personnel branch had, in previous years, done much to raise the
general standard of service interviewing, particularly in the fields of reselection
of air crew training failures, reallocation of tour-expired air crew and the disposal
of former prisoners of war.
Finally, as Science 4, the branch became part of the Scientific
Adviser’s organisation and through him responsible to the Chief


Training of Testing Staff 

A short account of the selection and training of testing staff may
be of interest. From 1942 to 1946 Clerks P.S. consisted exclusively
of women, but the quickened release of W.A.A.F. brought about
an almost complete change-over to R.A.F. personnel. It is anticipated
that the final constitution of the trade will provide for equal
proportions of men and women. As regards selection two main
points have been established. First, it has been found unprofitable
to train anybody below the top quarter of the R.A.F./W.A.A.F.
population in general intelligence. Secondly, great pains have to
be taken to ensure that every volunteer knows what the duties of
the trade involve. The second point may seem obvious, but very
little experience showed that the notions of many early applicants
as to the nature of psychological work were wild and sometimes
crude. A number assumed, rather naturally, that personnel selection
duties entailed a great deal of individual interviewing, while
others were looking forward to a banquet of Freudian [...]

[...] personnel received their training in courses originally
of three weeks’ duration, but the length of these courses has
grown to six weeks, as the scope of the work has developed. Not
[...] the R.A.F. [...] situations in [...] but features other than
test administration have been added. Thus while the prime aim
of the courses has always been to bring testers to a high level of
[...] tests under standard conditions, it has also been found
profitable to give a grounding in [...] construction and research
planning. As a result of this the limited resources of the scientific
staff have been saved from a mass of semi-technical effort which
could never have been [...] assistants. In addition the existence of a
corps of intelligent and enthusiastic testers has been of inestimable
[...] question of course content and syllabus structure.
Instructional periods are divided
almost equally into practical (i.e. exercises in the giving of patters,
demonstrations by wire recorder, marking, practice, etc.) and
theoretical (mainly but not entirely lectures). Lectures include
three main subjects, principles of mental measurement, statistics
(where each lecture is followed by an hour of practice) and R.A.F.
selection situations. About three-quarters of all the instruction
is given by the school staff (personnel selection officers), the
remainder being covered by visiting lecturers from Science 4.
During the last week a passing-out examination takes place, the
main emphasis being on practical testing with objective tests on
testing theory and statistics.


Air Crew Selection 


The entire choice of air crew (medical issues apart) was entrusted
at the beginning of the war to boards of serving officers who were
afforded no more definite briefing than to find the right types.
Three assumptions can be detected behind this: (a) that there are
right and wrong types for air crew duties, (b) that such types can
be detected by simple interview, (c) that officers who have themselves
been air crew are particularly suited to do the detecting.
[...] more important is it to appreciate the complexity of what the
boards were asked to do. First, they were expected to decide not
merely who should be accepted as air crew and who should not,
but also to which air crew category an accepted recruit should be
sent for training. Secondly, the boards were given no guidance
as to the relative importance of personality and skill factors in
reaching these decisions, and thirdly, they were provided with no
technical aids with which to measure either. The introduction of
the Bartlett tests marked a small but significant step towards the
solution of the last issue. Whatever intelligence, educational
attainment and sensori-motor co-ordination may be, they clearly
fall outside what are ordinarily considered personal qualities. This
distinction once made, the path was to some extent cleared for the
separation of skill assessment from the appraisal of character. This
separation was carried much further by the introduction of the
pilot grading test in 1942, and was completed when the Air Crew
Aptitude Test Battery was brought in during the spring of 1944.
From that point the job of the selection board was to evaluate
personal suitability for air crew, the allocation to categories being
undertaken at a later stage and being determined by aptitude and
preference*.

Towards the end of 1941 the decision to move flying training 
overseas focused attention on the selection of pilots. This was the 
first concrete problem Training Research were called on to attack 
and their initial analysis showed some alarming figures. It was 
found that of many hundreds who had commenced pilot training 
two years earlier only 41 per cent. had entered on operations,
while 36 per cent. had failed to qualify. Moreover, two-thirds of
these failures were falling short at the elementary flying stage which,
in other words, was acting as a selection screen so far as flying
ability was concerned. This was uneconomical enough when training
was in this country, but it would clearly be insupportable with
schools spread over half the globe. 
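
The wastage figures just quoted can be worked through per hundred entrants. The short sketch below only restates the arithmetic of the paragraph (36 per cent. failing, two-thirds of them at the elementary stage); the function name is hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def failure_breakdown(failed_per_hundred, elementary_share):
    """Split failures per hundred entrants into elementary and later stages."""
    failed = Fraction(failed_per_hundred)
    elementary = failed * elementary_share
    later = failed - elementary
    return elementary, later


# Figures from the text: 36 per cent. failed, two-thirds at the elementary stage.
elementary, later = failure_breakdown(36, Fraction(2, 3))
print(f"Elementary-stage failures per 100 entrants: {elementary}")
print(f"Later-stage failures per 100 entrants: {later}")
```

On these figures roughly a quarter of all entrants were being eliminated only after elementary flying had begun, which is the sense in which that stage was serving as the selection screen.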

The problem was far too urgent to allow of the evolution of a
test battery even had such a battery seemed the most likely way
to reduce wastage. It is, however, extremely questionable whether
such an approach could ever provide as good an answer as the work
sample method actually adopted. This method entailed a short
period (12 hours) of flying training with standard tests at prescribed
intervals (originally at 7 and 11 hours, later at 5½, 8½ and
11½ hours).

It will be seen that such a selection system assumes a close 
correlation between speed of learning and the ability subsequently 
shown. This assumption was firmly substantiated before the flight 
tests were brought in, and the whole justification for their use is 
based upon it. The evidence is given in Chapter XVI, together 
with details of test content, weighting of items, scoring and valida¬ 
tion, and other technical points arising from the need to operate a 
very complicated type of test in a number of flying schools simultaneously 
(cf. also Parry, 1947). As the word grading implies, the 
job of each of these schools was not to pass or fail the individuals 
it tested, but to array them in order of merit so that the cream of 
each intake could be skimmed by a central authority according to 
the needs of the moment. To utilise the tests to the full it was 
obviously necessary to grade considerably more than would be 
required to enter training, but, this margin conceded, the scheme 
was as well suited to picking the best 20 per cent. as the best 60 
per cent. 

* [...] cleavage. If intelligence and educational attainment [...] 
aptitudes, then aptitude did play some small part in the A.C.S.B. 
procedure. It is, however, true that temperament and character 
assessment played no part at all in category classification, nor is any 
evidence to hand to suggest that they should have done. 
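The order-of-merit principle described above can be illustrated with a short sketch. The candidate labels, the scores and the function name `skim` are hypothetical and are not taken from the grading scheme itself; the point is only that one ranked list serves any quota the central authority chooses.

```python
def skim(scores, fraction):
    """Return the top `fraction` of candidates, best first, by grading score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    quota = round(len(ranked) * fraction)
    return ranked[:quota]


# Hypothetical grading scores for one intake of ten pupils (no ties).
intake = {
    "A": 71, "B": 64, "C": 88, "D": 52, "E": 47,
    "F": 69, "G": 58, "H": 35, "I": 61, "J": 43,
}

top_20 = skim(intake, 0.20)  # the best two pupils
top_60 = skim(intake, 0.60)  # the best six pupils
```

Because both selections are cut from the same ranking, the smaller group is always contained in the larger one, which is why the margin graded beyond the number required costs nothing in accuracy.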

Early in 1942 gross pilot training wastage had been approxi¬ 
mately 48 per cent.; with the introduction of grading the figure 
dropped to nearly half (26 per cent.) The most important reduction 
was very naturally at the elementary stage where the gross com¬ 
parative percentages were 30 and 14, but a perceptible shrinkage 
(11 to 9) resulted at the next stage (service flying), while operational 
training wastage fell from 7 per cent. to 2 per cent. 
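The stage figures just quoted can be checked with a few lines of arithmetic. The sketch below only tabulates the percentages given in the text; the dictionary keys and variable names are illustrative labels. It may be noted that the three post-grading stage figures add up to 25, one point below the gross figure of 26 per cent. quoted, a difference which presumably arises from rounding.

```python
# Gross pilot training wastage by stage (per cent.), as quoted in the text.
before_grading = {"elementary": 30, "service_flying": 11, "operational": 7}
after_grading = {"elementary": 14, "service_flying": 9, "operational": 2}

total_before = sum(before_grading.values())
total_after = sum(after_grading.values())

# Reduction in wastage at each stage, in percentage points.
stage_drop = {
    stage: before_grading[stage] - after_grading[stage]
    for stage in before_grading
}

# Stage at which the introduction of grading saved most.
largest_saving = max(stage_drop, key=stage_drop.get)
```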

The introduction of grading inevitably affected the functioning 
of A.C.S.B.s which hitherto had decided the air crew categories 
each accepted applicant should enter. It was part and parcel of the 
new system that a high proportion of those sent to grading should 
not become pilots, and consequently re-classification of these to 
some other category had to be provided for. To meet this the 
boards early in 1942 were instructed to classify to the pilot, navi¬ 
gator and bomber categories jointly instead of severally. This 
meant that all deemed suitable for any one of these categories were 
given the opportunity to go to grading, and as about 90 per cent. 
of applicants were anxious to do this the field for the final selection 
of pilot trainees was greatly increased. The scheme, however, did 
nothing to aid the selection of the other categories and the need 
for a further selection phase which would permit of the assessment of 
navigator, gunner, etc. skills, became apparent eighteen months later. 

The primary purpose of aptitude testing was to secure skill 
measures for air crew accepted by the selection boards in respect 
of each of the six main categories. This led to the preparation of a 
two-day testing programme with twenty-four tests yielding six 
sub-batteries. No experimental period was available either to 
devise new tests or validate tests loaned by other services*. A 
posteriori validation was, of course, planned, and has been carried 
out so far as continuity of training has allowed—a summary of the 
findings appears in Chapter XVI, together with a detailed account 
of the way the problem was handled. Here it need only be added 
that weight was given in classification to the recruits’ preferences 
wherever skill as measured by the tests was adequate and the needs 
of the Service permitted acceptance. 

* Tribute must be paid to the U.S.A.A.F. in particular, who loaned a 
considerable number of their paper tests. 

From April, 1944, then, scientific measures have been in use for 
the assessment of air crew skills. But the problem of personality 
assessment remained, and though this has been attacked during 
the last two years much still remains to be done. A job analysis 
based on the opinions of $60 air crew officers has at least shown 
close agreement about the personal qualities held important in air 
crew, and incidentally suggests that the needs are much the same 
from category to category. It has also shown that the important 
qualities are for the most part conspicuously hard to rate at interview, 
while those that are easily assessable (diction, appearance and 
the like) are reckoned of very little account. All this makes it a 
difficult matter to supersede the unstandardised service interview 
by a more reliable instrument. But the old procedure has been 
strengthened by the insertion of a pre-interview at which the 
qualities deemed important receive independent rating. Table XLV 
(p. 277) shows that the reliability of trait assessments is on the 
whole satisfactory. In addition a biographical inventory is being 
prepared which is expected to throw more light on this question. 

Though the separation of skill and personality assessment has 
been a logical development, their temporal and geographical dis¬ 
placement has been an accident of circumstances. A fully deve¬ 
loped selection system works not by a series of successive hurdles, 
but provides for a weighing one against the other of at least the 
most important factors. The R.A.F. have now taken the first step 
towards this by placing air crew selection boards and aptitude 
testing at the same centre. The next technical problem is to find a 
reliable way of evaluating personality assessment so that this can be 
related to aptitude scores. 

To complete the picture of air crew selection two further features 
should be touched on: 

1. Advantage is now taken of the first stage of all—combined 
recruiting centre—to eliminate the more hopeless applicants. 
This is done by means of a fully controlled interview designed 
to anticipate selection board verdicts. It has been found possible 
to anticipate in respect of the worst 5 per cent. to 10 per 
cent. with almost 100 per cent. accuracy. Technically this is 
purely a screening interview, and is of interest because it has 
a precise and limited objective and has been proved to work 
effectively in the hands of interviewers with very little training. 

2. It has been claimed by the R.C.A.F. that the Link trainer 
can be adapted as a selection test to yield results comparable 
in accuracy to grading. This, if true, would result in a great 
economy of money, time and effort. An experiment to test 
the claim was planned in April, 1946, but large scale cessation 
of flying training has resulted in a mere trickle of validatory 
evidence. The few cases to hand suggest that Link adds 
appreciably to aptitude testing, but it has not yet been confirmed 
that it adds as much as grading. There is also a very 
real danger of candidates getting access to a Link before 
coming up for selection, and thereby securing a great 
advantage. 

It will perhaps be well to summarise this rather complex section 
with a table showing the successive air crew selection phases:* 


Name of Phase        Function               Selection Instrument    Action

1. Combined          Loose Personality      Controlled Interview    Elimination of
   Recruiting        Screen                                         lowest 10-26%
   Centre

2. Combined          A. Full Personality    Pre-interview           Results of A and B
   Selection            Assessment          Technique               weighed by Final
   Centre                                                           Board with power to
                     B. Differential        2-Day Testing           interview further
                        Skill Assessments   Programme

   (A and B prior to mid-1947 constituted successive hurdles)

3. Grading           Assessment of          Standard Flight         Better pupils to
   (Potential        skill in the air       Tests after 6½, 8½      Pilot training.
   Pilots only)                             and 11½ hours           Remainder to other
                                                                    categories found
                                                                    suitable in 2B

* Grading has now been discontinued for administrative reasons, pilot skill 
being assessed by aptitude tests alone. 

Selection of Aircraft Apprentices, Administrative Apprentices 
and Boy Entrants 

The main selection situations in the R.A.F., outside air crew, 
are ground trades, apprentices and boy entrants, and officers. In 
the ground trades the procedure instituted by Stephenson (p. 70) 
has so far received no fundamental re-direction, though additions 
and modifications have been made to it. Further information as to 
its working and its effectiveness is supplied in Chapter XVI. 

In 1939 apprentices and boy entrants were selected exclusively 
on educational attainment. Important though this factor is it 
cannot be relied on to yield consistent indices of technical ability, 
and when these entries were re-opened after the war the need for 
an amplified selection procedure was generally recognised. The 
current programme for aircraft apprentices is in two phases. The 
first,* carried out in a number of local centres, consists of a qualify¬ 
ing educational examination as a result of which a small proportion 
(under 10 per cent.) of applicants are eliminated. At the same time 
a general intelligence test is taken. All who survive this stage are 
then called to the R.A.F. selection centre where they undergo 
medical examination, take a one-and-a-half day programme of 
aptitude tests,† and are given a two-stage interview. Their final 
acceptance or rejection is determined by three lines of evidence— 
education, intelligence and technical aptitude, brought together 
and interpreted by a trained interviewer. The tests are additionally 
used to give pointers to trade allocation, particularly as between the 
mechanical and electrical trades, although every effort is made to 
evaluate the strength of trade preference and to concede first 
choices when possible. The tests in use have been selected from 
a large battery administered to the first post-war intake and pro¬ 
visionally validated against early training results. Full validation 
will be spread over a number of intakes and will take all stages of 
the four-year training into account. 

There is no preliminary qualifying examination for administrative 
apprentices and boy entrants. The first group are called 
to the selection centre where they take tests in educational attain¬ 
ment, general intelligence and clerical aptitude. A level of general 
ability at least as high as that for aircraft apprentices is required. 

The two apprentice populations constitute the technical nucleus 
of the R.A.F. and between them provide a large proportion of the 
personnel in the high-grade trades. The information arrived at in 
developing their respective test batteries is consequently directly 
relevant to the wider task of building up a general ground trade 
test battery. Further light is expected from the selection of boy 
entrants who are required for feeder trades. 

Administrative apprentices and boy entrants are, like aircraft 
apprentices, interviewed during the selection stage, their trade 
choices being carefully explored and effective contact made with 
every applicant. 

* This stage of selection is handled by the Directorate of Educational Services. 

† A considerable debt is owed to the N.I.I.P. for their generous advice and 
loan of tests. 

Throughout the war the R.A.F. relied on the interview for its 
choice of officers, but in 1946 it was decided to adopt W.O.S.B. 
methods. A small preliminary experiment (December, 1943) in 
which R.A.F. candidates for commissions were put through both 
selection procedures had pointed towards the new technique as the 
better predictor of O.C.T.U. results. In June, 1946, an R.A.F. 
Selection Board with Air Commodore as president was established 
to choose the first post-war entry of cadets for the R.A.F. College, 
Cranwell. Board personnel had the advantage of observation and 
training at Army centres. 

During the following autumn the same board was called on to 
evaluate the claims of serving officers who had applied for per¬ 
manent commissions, and since then it has been in almost continual 
session, handling in turn both types of entry. 

It is early to look for validation evidence in respect of either 
the cadet entry or the serving officer population. Indeed, in view 
of the high rejection rate it is likely to be extremely hard to 
evaluate follow-up results, particularly in the case of the cadets 
where the great proportion of failures are lost to the Service 
completely. 

Other Aspects of Air Force Psychology 

The main types of problems, outside selection, with which 
Science 4 has been concerned are training assessments, instruc¬ 
tional method, and morale. The first of these has received a fair 
share of attention, the second has been studied only in respect of 
isolated problems, while the third is claiming consideration now 
for the first time. 

Training Assessment.—The general problem is the introduction 
of objective methods in respect of practical as well as written tasks. 
The unreliability of traditional methods has been demonstrated 
over and over again and hardly needs emphasis here. Two illustrations 
will suffice: Variations in the use of a common scale have been 
shown to be so great from one school to another that the pooling 
of marks from different schools often leads to the most grotesque 
results (this, of course, is compatible with objective orders of merit 
within each school). Secondly, scarcely any agreement has been 
found, in many instances, between the marks obtained by indi¬ 
viduals at similarly named subjects taught in successive stages of 
training. 

The objectification of written examinations presents no special 
difficulty in R.A.F. technical trades. The best procedure involves 
the summoning of a small panel of instructors who under the guid¬ 
ance of a psychologist elaborate a pool of new-type items. From 
this it is usually possible to extract two or three alternative versions 
each with eighty to a hundred questions. 

The difficulties with practical tasks and above all aerial exercises, 
e.g. pilot skill, are very much greater and a long experimental 
period is customary before even moderately trustworthy tests can 
be attained. The phase check or standard efficiency test (p. 267) 
which entails the application of the objective method to learned 
drill sequences has been adopted in many settings with satis¬ 
factory results. 

Two by-products of the whole training assessment problem are 
the supplying of reliable criteria against which to validate selection 
tests and the application of up-to-date corrections to syllabus content. 
For the pooling of objective items forces attention on the 
emphasis given to different areas and in some cases on the survival 
of features which no longer need be taught. The danger of accept¬ 
ing a neatly-made assessment scheme as evidence of sound assess¬ 
ing has been demonstrated again and again, as also the danger of 
assuming that instruction and assessment should necessarily be 
undertaken by the same people. 

Instructional Methods.—Apart from a few isolated experiments 
(mostly in the field of morse training) very little has been under¬ 
taken here. This is not for any lack of problems. On the contrary 
the teaching syllabuses of most technical trades have become so 
swollen and overloaded that there is a crying need for basic investi¬ 
gation. What is required ideally is the attachment of a qualified 
psychologist to each of the main R.A.F. schools. Something much 
more fundamental than piecemeal patching and shuffling is called 
for. Controlled experiments into such things as length of working 
day, relation of theory and practice, training sequence, classroom 
conditions, use of demonstration aids, use of notes and selection 
of instructors all need to be investigated. 

Morale.—Peace-time morale, particularly with the conscription 
issue to reckon with, is a far more difficult matter than morale in war. 
The first step is the systematic evaluation of opinions on such 
topics as general sources of discontent, accident prevention and 
attitude of the conscript to the Service. An attempt is now being 
made in these directions which make their appeal very naturally 
to the social psychologist rather than the psychometrist. While 
evaluation is in progress the Service psychologist can do much by 
making the Service mind aware that these problems exist and have 
to be met. It is usual and no doubt natural that these efforts should 
often be met with resistance. The view that people will always 
grouse, or that the conscript is bound to resent his period of 
service, is frequently heard, and needs to be combated. 



PART II 


The principles of personnel 
selection and guidance which 
have evolved both from pre¬ 
war investigations and from 
war-time experience. 




CHAPTER VI 


GENERAL PRINCIPLES OF VOCATIONAL 
CLASSIFICATION 

Abstract.—Personnel work in the Services consisted of a combination 
of selection—in the sense of picking the best men for a 
job, and of guidance—in the sense of recommending the job 
which would best suit each man. This conception, which we call 
vocational classification, is worth pursuing in peace-time education 
and industry. The following are the main elements of classification 
procedures: 

(1) Provision of information about jobs to candidates helps to 
promote self-guidance. The staff responsible for classification 
must also be provided with information in a usable 
form, but do not usually need detailed job analyses. 

(2) Collection of relevant data about adult candidates is mainly 
done by biographical questionnaire and interview. But cumulative 
pupil record cards are more appropriate with children. 

(3) Educational and vocational classification should take account 
not merely of the candidates’ abilities, interests and personalities, 
but also of sociological factors, in particular, attitudes 
towards education or to employment. 

(4) Tests and measurements are important, but their role in 
classification should not be exaggerated. 

(5) Interviewing has several vital functions and, in spite of its 
weaknesses, cannot be dispensed with. 

(6) Classification should not be based on a single “cross-section” 
of the individual, but should be linked as a continuous 
process with education or training. 

(7) The main feature of the scientific approach is validation, or 
continual checking of the value of the procedures employed. 

(8) Fully qualified psychologists can seldom undertake classi¬ 
fication themselves, but should give thorough training to 
suitable persons in education, industry, etc., and should 
supervise their work. 

The work of psychologists in the Forces which we have been 
describing can hardly be called occupational selection or vocational 
guidance. It is a blend of both, which we will call vocational 
classification. The term is not an ideal one since it suggests the 
forcing of recruits into jobs which they may not approve; whereas 
the essential object is to adjust or harmonise the whole range of 
recruits (each possessing his or her individual capacities and 
interests) with the range of Service employments (each having its 
particular requirements). 

Occupational selection has in the past often been a somewhat 
anarchical procedure whereby an industry attempted to grab the 
best available workers, and to disregard the wider interests of the 
community. It will be shown later that, provided the supply of men 
for a job greatly exceeds the demand, the quality of employees can 
be raised by almost any selection procedure, whether it be 
interview by an untrained foreman or a few tests whose validity 
is very low. [The more efficient the] procedure, the greater the 
improvement in quality, but even an inefficient procedure will 
appreciably cut down the number of unsatisfactory employees, so 
long as no account is taken of the candidates wrongly rejected 
(cf. p. [...]). If, however, we pay attention not only to the candidates 
wrongly accepted, i.e. those who pass the selection procedure 
and turn out unsatisfactorily, but also those rejected by the 
procedure who could actually have managed the job, the procedure 
must attain a much higher validity. Again, if the supply does not 
greatly exceed the demand, and the so-called selection ratio (the 
number to be employed divided by the number of candidates) is 
high, it is far more difficult to choose the best men for the job 
(cf. Tiffin, 1946). 
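The argument of the preceding paragraph can be illustrated by a small simulation. What follows is a sketch under stated assumptions, none of which comes from the text: ability and test score are taken to be normally distributed, validity is modelled as their correlation, the upper half of the population is counted as "satisfactory", and the validities of 0.3 and 0.6 and the selection ratios of 0.1 and 0.8 are hypothetical values chosen for illustration.

```python
import math
import random


def simulate(validity, selection_ratio, n=20000, seed=0):
    """Select the top scorers and count right and wrong decisions.

    validity        -- assumed correlation between test score and ability
    selection_ratio -- number to be employed divided by number of candidates
    """
    rng = random.Random(seed)
    noise = math.sqrt(1.0 - validity ** 2)
    people = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)
        score = validity * ability + noise * rng.gauss(0.0, 1.0)
        # Assumption: the upper half of the population could manage the job.
        people.append((score, ability > 0.0))
    people.sort(key=lambda person: person[0], reverse=True)
    k = round(n * selection_ratio)
    accepted, rejected = people[:k], people[k:]
    good_accepted = sum(ok for _, ok in accepted)
    return {
        "accepted": k,
        "satisfactory_rate": good_accepted / k,
        "wrongly_accepted": k - good_accepted,
        "wrongly_rejected": sum(ok for _, ok in rejected),
    }


weak_low_ratio = simulate(validity=0.3, selection_ratio=0.1)
weak_high_ratio = simulate(validity=0.3, selection_ratio=0.8)
strong_low_ratio = simulate(validity=0.6, selection_ratio=0.1)
```

Under these assumptions a weak procedure with a low selection ratio still yields a markedly better than average group, and the same procedure with a high ratio yields very little improvement. The `wrongly_rejected` count shows the cost that is ignored when only the accepted candidates are examined.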

In 1941, when psychologists entered the Forces, anarchic conceptions 
were rife. Certain branches of the Navy, Army, and Air 
Force were using their own interview or other procedures to grab 
high-quality recruits, entirely disregarding both the needs of other 
less powerful or less attractive branches, and the interests of the 
thousands of men they rejected who would have been perfectly 
adequate for the work. Some branches, as we have seen, even 
enlisted the aid of university psychologists to improve their procedures. 
The results were quite gratifying to the branch concerned 
and to the psychologists, but actually produced greater vocational 
maladjustment in the Forces as a whole. The same conception was 
prevalent in education until the 1944 Act. “Public” and Grammar 
Schools could afford to set stiff entrance and scholarship examinations 
whose validity in picking the pupils best fitted for secondary 
education was dubious, because of the low selection ratio, i.e. the 
excess of candidates over available places. It did not matter to 
them that thousands of pupils were rejected who would have 
derived as much (or greater) benefit from advanced education as 
[those accepted]. 

Vocational guidance, on the other hand, has tended to regard 
the interests and capabilities of the individual as paramount, and 
has recommended the job or types of job best suited to him. While 
N.I.I.P. psychologists have always taken into account the opportunities 
that each individual would have for following the 
recommended career, they were naturally unable to match their 
recommendations closely with the state of the labour market. Their 
system, in other words, works best when demand exceeds supply, 
so that the individual can readily pick and choose among a wide 
range of available jobs; whereas vocational selection prefers the 
opposite state of affairs. Other difficulties with the guidance given 
by the National Institute to individual applicants are its length 
and expense. When each candidate consumes the best part of a 
day’s work by a highly trained investigator, the fee cannot be 
reduced much below £4, which puts it out of reach of the majority 
of the population. Such guidance, moreover, is largely restricted 
to the higher occupational levels such as the professions, and a much 
simpler procedure might be adequate for less skilled work in which 
individual aptitude and interest in the work itself are less important. 

Very early then in the history of the Institute, attempts were 
made to extend the procedures more widely. In a series of experiments 
in London, Birmingham and elsewhere, guidance was given 
to large groups of school-leavers who were representative of the 
general population. Outstanding features of this work were its 
comparative speed and simplicity, its application to all occupational 
grades, its validation by subsequent follow-up, and its 
integration with a survey of the labour market in the particular 
districts where it was carried out. Thus, in the most recent 
Birmingham experiment, some 1,600 pupils were psychologically 
assessed by 74 specially trained teachers, and were placed in jobs 
which accorded as closely as possible with the assessments, by 10 
juvenile employment officers (cf. Hunt and Smith, 1946). Rodger 
(1939a) suggests that large-scale vocational classification of school- 
leavers can best be tackled by such co-operation between the 
schools and the Employment Exchanges. Similar schemes have 
been attempted elsewhere, for example, by Macdonald (1938) in 
Edinburgh and Meiklejohn (1945) in Ayrshire. 

In the Forces, as we have seen, it was essential to find suitable 
employment for all recruits and to consider the requirements of all 
the branches. The supply of good men usually fell far short of the 
demands, and wrong rejections by inefficient selection procedures 
had as bad effects as wrong acceptances. Vocational guidance, 

in the sense of recommending for each recruit the job he could do 
best, was equally inappropriate, since only a limited range of 
employments was available, each requiring certain definite quotas 
of men or women. Thus it was impossible to suit the interests, 
experience and aptitudes of every individual. To take a naval 
example: in 1942-3 most of the mechanic branches were crying 
out for men, and large numbers of recruits with no engineering 
experience had to be trained for such jobs. But by 1946 the demand 
dropped so greatly that many recruits with good mechanical ability 
and training had to become seamen, writers (clerks), and so forth. 
Even if Admiralty psychologists had possessed a perfect test which 
showed that, say, 20 per cent. of recruits had good mechanical 
ability, they would have had to disregard its findings when the 
Navy wanted either 30 per cent. or 10 per cent. of recruits for 
mechanic branches. While fluctuations may be less violent in 
civilian life, here too there is no guarantee that the distribution of 
abilities and interests of school-leavers in any one area will match 
the distribution of job requirements. And even in the largest 
industrial concern the range of jobs open to employees about to be 
allocated or transferred is obviously still more restricted. The 
present lack of candidates for mining, cotton, brick-making and 
other industries shows how impossible it is to obtain correspondence. 
A good illustration was provided by the Dundee vocational 
guidance investigation (Pallister, 1938), when it was found that 
roughly 16 per cent. of the girls leaving school would be absorbed 
by each of the main employments—factory work, jute mill, shop 
assistant, and office work. But when a questionnaire was given to 
typical pupils, 30 per cent. wanted shop assistant, 27 per cent. 
factory, 20 per cent. office, and 4 per cent. mill. Clearly a large 
proportion of girls were certain to be disappointed. 
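The size of the mismatch in the Dundee figures can be worked out directly. The sketch below takes the "roughly 16 per cent." absorbed by each employment as exactly 16, which is an assumption made for the sake of the arithmetic, and the variable names are illustrative. It gives only a lower bound on disappointment, expressed as a percentage of all the girls questioned.

```python
# Percentage of leavers each employment could absorb (assumed exactly 16).
openings = {"factory": 16, "jute_mill": 16, "shop_assistant": 16, "office": 16}

# Percentage of pupils stating each preference, as quoted in the text.
wanted = {"shop_assistant": 30, "factory": 27, "office": 20, "jute_mill": 4}

# Preferences in excess of openings: these girls cannot all be satisfied.
surplus = {job: max(0, wanted[job] - openings[job]) for job in openings}

# Openings in excess of preferences: places nobody asked for.
shortfall = {job: max(0, openings[job] - wanted[job]) for job in openings}

# Lower bound on the percentage of all pupils who must be disappointed.
min_disappointed = sum(surplus.values())
```

On these figures at least 29 per cent. of the girls could not have entered the employment they named, while 12 per cent. of the jute mill places would have had to be filled by girls who wanted something else.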
The conception of classification worked out in the Forces has 
important civilian applications during the post-war years when 
Britain’s economic recovery largely depends on the efficient utilisa¬ 
tion of the available man- and woman-power. It is implicit, too, 
in the new approach to education. The desire to provide the 
amount and the kind of education that accords with each child’s 
talents and interests is praiseworthy, even though, as Burt (1943a) 
shows, the notion of classifying pupils at the age of 11-12 into 
academic, technical and modern “types” is psychologically un¬ 
sound.* As mentioned in Chapter I, most American high schools 
and colleges have educational counsellors who are trained in the 
application of modern psychological techniques of testing and 
interviewing, and who are likely, therefore, to be able to guide the 
educational careers of the pupils and students more effectively 
than, say, an over-worked head master. In industry also the conception 
of classification is winning wider recognition. A firm such 
as Rowntrees at York absorbs every year a certain proportion of the 
school-leaving population of the district, and uses its personnel 
selection procedures not so much to pick the cream as to sort the 
available employees according to the various openings within the 
factory for which they will be most suited (cf. Northcott, 1946). 
Tiffin (1946), an American industrial psychologist, similarly points 
out that scientific vocational methods do not lose their value when 
labour is in short supply, since it is all the more important to use 
them for transfers and promotions within any one firm, in order 
to improve the quality of employees assigned to each job, and to 
improve morale by reducing the number of misfits. 

What are the main principles underlying large-scale classifi¬ 
cation procedures as worked out in the Forces, which can also 
be applied in modern industry and education? They may be listed 
as follows: 

(1) Provision of information to candidates and to those responsible 
for the classification. 

(2) Collection of relevant data about the candidates by questionnaires, 
cumulative records, etc. 

(3) Due consideration of sociological factors. 

(4) Application of standardised tests of aptitudes and abilities. 

(5) Competent interviewing. 

(6) Linking of the selection or guidance process with education 
and training. 

(7) Validation and follow-up of all procedures. 

(8) Training and supervision of personnel who carry out the 
scheme. 
Provision of Information 

Macrae (1932), Oakley (1937) and others have pointed out that 
one of the most potent sources of vocational maladjustment is 
ignorance or distorted information as to the nature of various types 
of job. Experience in the Navy in particular shows that most 
people are capable of a large measure of self-guidance provided 
that they have access to simple but accurate descriptions of the 
work and all that it involves; and that a large amount of the 
psychologist’s time can be saved if they have received and con¬ 
sidered this information before seeing him. While direct vocational 
education is seldom if ever advocated nowadays, there is every¬ 
thing to be said for more education about vocations. In enlightened 
primary schools projects are often devised which lead the pupils 
to acquire English and arithmetical skills, and these may be centred 
round, say, the post office, the railways, ships, house-building, or 
a local industry. As the leaving age draws nearer, more comprehensive 
surveys of employments can be provided by films, books 
and pamphlets, visits to works, and, as Rodger (1944) suggests, 
“Brains Trust” discussions. Talks by employers are, of course, 
likely to over-emphasise the attractions of certain jobs, but talks 
by young or by mature employees, e.g. former pupils, may be 
enlightening. The value of giving vocational information to 
beginners in any large industry, by films, talks, and works tours, 
is widely recognised, since it not only helps in the initial process of 
allocation, or in the re-allocation of employees who prove unsuit¬ 
able, but also stimulates each worker’s interest in his particular 
job when he realises how it fits into the whole scheme of production. 

Still more necessary is it for the person responsible for guidance 
or selection to be familiar with job requirements, and to be able 
to clear up misconceptions in the minds of candidates. Teachers 
who assess and advise school-leavers cannot be expected to be 
industrial experts, though any knowledge they can acquire and 
use, or pass on, is better than none. They should, for example, 
study Oakley’s Handbook (1937), the Ministry of Labour’s pamphlets 
on careers, and other such literature. But, just as in the 
Services, so in civilian life the state of the labour market varies, and 
the [...] of many jobs may alter radically. Hence it should be the
Ministry of Labour’s function to survey the openings and to give
up-to-date information to the schools. Rodger (1939a) suggests
that this should be done under the same headings as the
[...] psychologist uses in assessing the qualifications of the
candidates, for example:

Physical and medical requirements,
Educational qualifications,
Level of general intelligence,
Special aptitudes,
Interests,
Disposition or temperamental qualities,
Other relevant circumstances (training, apprenticeship, etc.).

It is not possible to enlarge here on scientific job analysis
procedures (cf. Bingham and Freyd, 1926; Viteles, 1932), but the
[...] desiderata should be mentioned. Neither superficial
observation of the work by a psychologist, nor the description given by
an employer or foreman are adequate. The experience of men who
are successful at the work and of failures should be critically
[...] and integrated, and special note made of the commoner
causes of failure. The analysis should avoid the use of vague traits
or [...]ties (persistence, concentration, dexterity and the like),
and should present as objective as possible a description of the
activities or operations performed. At the same time it is
[...] useful to list in detail the specific movements, etc., or to
[...] time and motion study analyses. Rodger’s list of points
may be expanded to include information about employment
conditions (physical and social), special advantages and disadvantages,
[...] other incentives, promotion and prospects, length and
[...] training, relation to other jobs in the organisation, and
[...] experience that may be useful or necessary.

Collection of Data

[...] B. Wilson estimates that at least half of the information
on which vocational decisions are based should be derived from
past history, the other half consisting of tests and interview
judgments of present abilities and other qualities. Hence, the enquiry
into the past must be very carefully planned.
An account is given in Chapter VIII of the biographical
questionnaires adopted in the Forces, and of the value of the
information which they yielded. On these are collected not only
details of the recruits’ background, but their test scores and the
interviewer’s findings and recommendations. Thus they constitute
the keystone of the whole classification procedure. They are
nevertheless limited both by the accuracy of the recruits’ memory, and
by the time factor. They cannot contain more questions than the
dullest recruit can understand and answer in half to three-quarters
of an hour.

Vocational work in schools has the great advantage that it does 
not have to be carried out in a few hours or days. Relevant features 
of children’s careers can be observed and recorded at the time,
instead of having to be recalled later. Thus, the cumulative record 
card should be made the focus of classification procedures among 
children. (In an industrial firm, an analogous record card is 
obviously of the greatest assistance to the personnel official who 
deals with transfers and promotions.) While most education
authorities now maintain pupil record cards, an investigation by 
the N.I.I.P. (Cockett, 1946) has shown how unsatisfactory are most 
of these for vocational purposes. Their emphasis tends to be on the 
weaknesses of pupils, whereas the careers master is mainly concerned 
with potentialities and strengths. Usually they are so incomplete 
that he will need to design a supplementary form or else adopt a
carefully constructed card such as that of the Institute, which covers:

(1) Records of scholastic attainments over several years. 

(2) Medical data. 

(3) Positions held in school, games, etc. 

(4) Family background, out-of-school employment.

(5) Interests and hobbies.

(6) Ratings of traits of temperament and character.

(7) Objective test results. 

(8) Occupational suggestions. 

Hamley’s (1937) and Fleming’s (1946) suggestions are also worth
consulting. Such a document is preferably filled in by the careers 
master or other specially trained member of the staff on the basis 
of discussions with the children’s teachers, rather than by the 
various teachers themselves. 

Sociological Factors 

So numerous and so complex have been the developments in 
psychological and statistical techniques in recent years that there 
is often a tendency to neglect the wider sociological setting of
problems in applied psychology. As Burt (1947) has described, 
selection for secondary education is often discussed purely in terms 
of the children’s abilities as measured by examinations or teachers’ 
gradings, intelligence or other psychological tests. It is not enough 
to add assessments of personality traits such as industriousness, or 
of interests in different types of curricula. In many cases the main 
factor in success or failure at the secondary school is the attitude 
towards schooling current in the social group from which the
children come, and particularly in their families. Similarly, estimates
have been published of the proportion of the population likely to 
possess the ability needed for university education, apparently 
ignoring completely the fact that going or not going to a university 
is largely determined by the attitudes of the social and economic 
classes to which prospective students belong. When some new 
technique, say intelligence tests or vocational guidance, works well 
in one context, over-enthusiastic advocates apply it
indiscriminately, not realising that it may require considerable modifications
to adapt it to the mentalities of the individuals, or to fit it into the 
framework of the social institutions with which they are concerned. 
We have mentioned in earlier chapters some of the conflicts that 
occurred between the technicians and Service authorities.
Undoubtedly these would have been far more serious but for the
constant endeavour of the psychologists to adjust their procedures to
Navy and Army traditions, customs and standards. Similarly the 
educational or vocational psychologist should avoid planning any 
scheme of selection or guidance in the abstract, and should make 
himself thoroughly acquainted with the educational or industrial 
system within which he is working. Burt deplores the absence of 
sociology from the training of most educational psychologists, and 
points out that the provision of psychiatric social workers in many 
child guidance clinics is an inadequate substitute, since their job 
is to note abnormal factors in the home situation rather than the 
normal mores of the society in which the children are reared. 

To the Tavistock Clinic Group (cf. p. 64), it is not merely
sociological but “sociatric” factors which are of prime importance,
that is the underlying—often unconscious—emotional forces
within the organisation. It is well known, for example, that the
grievances consciously expressed by employees in an industrial firm, say
about the canteen or about wages, are often projections of, or
unwitting substitutes for, deeper dissatisfactions with the autocratic
attitudes of the management. Similarly a request from a firm for
the institution of scientific selection methods may be a defence
reaction, an unconscious desire to ward off enquiries into more
serious maladjustments. If the psychologist meets such a request
by devising selection tests, or if welfare officers improve the
canteen arrangements, these fragmentary solutions may actually
exacerbate the trouble and lead to the appearance of fresh
symptoms. Should the psychologist suggest the real source of friction
and inefficiency, he is apt to arouse hostility, or to be side-tracked
on to dealing with trivial details. Hence his aim, like that of the
psychotherapist who is treating a neurotic patient, should be to
help the institution to understand its own deeper motives and to
achieve a more harmonious integration.

While the present writers are sympathetic with such
psychoanalytic explanations of the maladjustments both of individuals
and of social groups or institutions, they would stress the
subjectivity of these approaches. Just as different analysts diagnose the
same patient differently and explain his symptoms (and often
ameliorate them) in terms of diverse theories of human motivation,
so different “sociatrists” would be liable to analyse a firm’s or an
educational system’s difficulties differently. Obviously, too, very
few persons could be trusted to handle such investigations. One
shrinks from the prospect of a newly-fledged educational
psychologist lecturing his education committee on their unconscious
mechanisms which stand in the way of an efficient selection scheme.
Nevertheless, some training in abnormal as well as normal
sociology should help him, or any other applied psychologist, to
realise why scientific selection and guidance schemes are not always
immediately acceptable or practicable, to feel his way tactfully and
gradually and to bear with his frustrations.

Tests 

The principles of testing are discussed in later chapters. Here it
is desirable to combat the exaggerated rôle assigned to tests
of abilities by many writers, particularly in America. Vocational
classification is often described as putting “square pegs” into
“square holes,” as though it consisted merely in matching
mechanically the measured characteristics of the candidates with the
appropriate “profile” of job requirements. It is a misleading
over-simplification of the aims and methods of vocational psychologists
to neglect all the principles in our list except Nos. 4 and 7—Tests
and Validation (only admitting Nos. 2 and 6—Personal History
Items and Interview Judgments—in so far as they can be treated
as additional “tests”). Indeed, nothing is more likely to bring
vocational psychology into disrepute, especially if the tests are
applied and interpreted by not very highly qualified personnel. It
is gratifying to observe that the plans of American psychologists
for the rehabilitation and guidance of demobilised recruits assign
an important but far from pre-eminent function to tests, and
recognise the crucial value of skilled interviewing, of surveying
each individual’s past record, and of integrating the guidance
procedure with training (cf. Scott & Lindley, 1946). In other words
they appear to be accepting the view, generally held by British
psychologists (cf. Rodger, 1939b), that tests should be “servants
not masters.”

In most jobs such factors as health and physique, interests,
attitudes and personality qualities, previous experience and
training, and sociological circumstances are more influential than the
abilities which can readily be tested. Such factors can, it is true, 
sometimes be treated quantitatively or measured by appropriate 
tests (cf. pp. 136-142), but in actual practice they are more often 
elicited by biographical questionnaire and interview, and weighed 
up qualitatively. 

Long and Lawshe (1947) point out that in 1919 psychologists
were expecting tests to be adopted widely in selection for any and
every job, but that twenty years later there had been very little
advance. Only about 7 per cent. of a large sample of firms in the
United States were making any use of tests in 1936 (though there
appeared to be a rise in some smaller surveys in 1940). They
attribute this partly to undue enthusiasm on the part of testers, partly
to undue suspicion among users. In our view the main factors are:
(a) the extraordinary difficulties of devising practicable tests and
of carrying out even preliminary validation, still more of keeping
such validation up to date and accommodating the test batteries
to alterations in the nature of the work; and (b) the importance of
the factors other than abilities, just mentioned. Few if any
industrial or educational organisations could afford the numbers of
highly skilled personnel which would be needed to carry out
vocational schemes based wholly on tests. In the 1920’s Hull (1928)
advocated the construction of some 30-40 tests which between
them would measure the abilities needed in 60 of the main
occupations. By mechanical methods an individual’s scores on such a
battery could be appropriately weighted and combined to predict
his suitability for all these occupations (cf. p. 104). Only recently
has this project been put into effect by the U.S. Department of
Labour, Occupational Analysis Division (Dvorak, 1947). Fifteen
varied tests, taking about two and a quarter hours, provide
measurements of ten main ability factors, and minimum scores on
these factors are indicated for twenty groups of common
occupations. While the scheme is highly ingenious, we feel that it appears
to make vocational classification delusively easy. Moreover, the
validity of several of the tests, and of the factors based on them, in
predicting occupational success is likely to be low, and no evidence
on this point has so far reached us.
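
The two mechanical schemes just described can be illustrated by a minimal modern sketch. It is not a reconstruction of Hull's or of the Department of Labour's actual procedure: the test names, weights and minimum scores below are invented, and the function names are hypothetical. The first function forms a weighted combination of test scores for one occupation; the second lists the occupational groups whose minimum factor scores a candidate meets.

```python
from typing import Dict, List


def weighted_suitability(scores: Dict[str, float],
                         weights: Dict[str, float]) -> float:
    """Weighted sum of test scores for one occupation.

    A test missing from `scores` contributes nothing to the total.
    """
    return sum(w * scores.get(test, 0.0) for test, w in weights.items())


def groups_open_to(factors: Dict[str, float],
                   minima: Dict[str, Dict[str, float]]) -> List[str]:
    """Occupational groups for which every minimum factor score is met.

    A factor missing from `factors` is treated as failing its minimum.
    """
    return [
        group
        for group, required in minima.items()
        if all(factors.get(f, float("-inf")) >= m for f, m in required.items())
    ]


# Invented figures, for illustration only.
candidate = {"verbal": 55.0, "numerical": 62.0, "spatial": 48.0}
clerk_weights = {"verbal": 0.5, "numerical": 0.5}
minima = {
    "clerical": {"verbal": 50.0, "numerical": 50.0},
    "drafting": {"spatial": 55.0, "numerical": 50.0},
}

suitability = weighted_suitability(candidate, clerk_weights)
open_groups = groups_open_to(candidate, minima)
print(suitability, open_groups)
```

The ease with which such a calculation can be carried out is precisely the point of the caution above: the arithmetic says nothing about whether the weights or minima are valid predictors of occupational success.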

The Interview 

This topic, too, receives fuller treatment below, and we will 
merely outline its main functions.

(i) Clarifying and amplifying the data given in the
questionnaire and assessing its significance; in particular scrutinising
claims to trade experience. 

(ii) Providing the interviewee with more detailed information 
about any jobs in which he is interested or which appear 
suitable. 

(iii) Surveying his interests and attitude, e.g. towards different 
types of secondary education, to an industrial firm, or to 
the Service. While it is entirely feasible to devise scales for 
measuring any one well-defined interest or attitude, or 
several simultaneously (cf. Vernon, 1938a), yet so great is 
their variety, and so complex their inter-relations in 
different individuals, that the more flexible, even if less 
accurate, interview approach can scarcely be dispensed with. 

(iv) Judgments of temperament and character from facial 
expression, gestures, manner, conversation, and directed 
discussion of past history and future aims. This is the aspect 
of interviewing which is most unreliable, and most likely 
to differ from one interviewer to another. Nevertheless, it 
can be standardised and improved in many ways, and 
in the absence of good tests of personality, is almost 
the only feasible approach to the assessment of this vital
factor. 

(v) Synthesising and balancing up all the factors such as
experience, education, abilities, interests, personality and job
possibilities. Here also there is so much scope for subjective
judgment that the conclusions reached by poor interviewers
may often be less valid than conclusions based, say, on tests
alone or on the interviewee’s expressed preferences alone.
Nevertheless, here also more scientific procedures raise
almost insuperable technical difficulties.

(vi) Stimulating the individual’s morale. Even if (as may
sometimes be the case) the interview is worse than useless, it
would still be desirable to include it, at least in Service 
classification procedures, because it is by far the most 
“human” of psychological techniques. It shows the recruit 
that he as a person is receiving consideration, and that the 
Service is trying to take account of his capabilities and 
interests. It would be impossible too to convince the Service 
authorities that either the recruit or the Service itself was 
getting a fair deal if impersonal methods such as tests and 
questionnaires alone were used. Similarly, in industry the 
goodwill towards the firm engendered by sympathetic
interviewing is a potent consideration. Finally, tactful persuasion
by the interviewer is often needed if the candidate is to 
accept willingly the job for which he is most suited, rather 
than the job which he thinks he would like. 


Links with Training or Education 

Both guidance and selection at the present time tend to be based
to too great an extent on a single “cross-section” of the candidates.
As pointed out under No. 2, classification procedures among
school children can be spread over several years. While the
questionnaires used in the Forces attempt to take the past into account,
there are obvious advantages in the approach advocated by R.A.F.
psychologists, which regards selection and training as parts of one
and the same process. The effective placement of a civilian recruit
in a Service job involves both his initial allocation and the
successful completion of his training (or his re-allocation and re-training).
During training his competence for the job is examined, just as it
is in selection testing; moreover, each completed stage of training
is considered to be predictive of suitability for the next stage, just
as is the psychological examination.

Vocational classification in the Services was handicapped by the
weakness of the links between the psychological, the training and
the education branches. Apart from the fact that they did not
always see eye to eye, the effective integration of their aims and
techniques was rendered impossible by the shortage of
psychologists. The result was that innumerable problems of mutual
interest failed to receive the scientific investigation they deserved.
Some training courses were largely inappropriate; training devices
and visual aids were employed without any proof of their value;
old-fashioned examination methods were used which not only
probably failed the wrong men, but also provided thoroughly
unsatisfactory criteria against which to gauge the validity of
selection procedures. Again, more attention might have been paid to
the psychologists’ view that, as only a limited supply of highly
intelligent or well-qualified men was available, certain training
courses should be simplified and lengthened for the benefit of the
poorer-quality trainees. The naval “transfer” scheme and the
Army A.S.C.s, described above, did help as it were to plug some
of the leaks, but the most effective schemes of
selection-cum-training in the Services were those developed for naval officer
candidates, and for R.A.F. air crew.

It is appropriate here to point out the close inter-connection
between selection, training, and factors of design and layout in the
job itself. When, as so often happens, the work involves needlessly
complex perceptual and muscular activities, selection is rendered
more difficult and training lengthened. Conversely, the
psychologist can often simplify his selection problems by modifying the
equipment in such a way that a much larger proportion of the 
population can manage the job after fairly brief training (cf. Craik,
1945).

The desirability of the conception we are advocating is already
generally recognised in education. Vocational and educational
guidance should be a continuous process from 11 years to
adulthood. At each stage in a child’s career the most suitable type of
future schooling, or the most suitable vocational plan should be
considered. Moreover, the predictive value of the various school
examinations should be followed up in just the same way as that
of vocational tests. In many industries, again, the selection and
training of apprentices, or their transfer to other simpler jobs if they
fail to make good progress, are combined in a single department.

Validation 

The methods by which educational and vocational classification
are carried out are neither new nor mysterious. Psychological tests,
interviews and questionnaires are simply refined and standardised
versions of the methods to which schools and industries have long
been accustomed. Thus, the main difference between psychological
and other procedures is that the psychologist insists on validating
his instruments and testing his tests, instead of relying on “hunches,”
or on “experience” and uncontrolled observation. He is aware of
the fallibility of human judgment (including his own), of the
tendency to jump to conclusions and to generalise from a few
instances, to reason in accordance with our sentiments and
complexes rather than in accordance with the facts, and to
over-simplify the personalities with which we have to deal.
Investigations of gradings or assessments of personality traits have shown
both how widely different judges of the same person may differ,
and how influential is the “halo effect,” that is the impregnation of
a judge’s gradings of specific qualities with his general good or bad
impression (cf. Vernon, 1938a). The ordinary man or woman
fails to realise, for example, that because a boy cheats in his work
at school he is not necessarily destined for a life of crime, and that
this peccadillo probably has no bearing at all on his mechanical
skill. An untrained interviewer or personnel official is often
influenced by even less relevant characteristics—a smiling face, a large
jaw, hands in the pockets, a girl’s use of lipstick, and so on.

Predictions of vocational suitability are often based not merely
on subjective impressions but on such tests as school examinations,
or on actual performance in some job. This information too may be
extremely misleading unless it is scientifically validated, and proved
to measure the qualities needed. A very common and insidious
error, of which doctors and psychologists themselves as well as
non-scientists are guilty, is the “naming fallacy”—that is the
assumption that a test is relevant if it has the same name as a
quality involved in the job. The radar operator, for example, needs
good “vision,” but no one has bothered to investigate whether the
aspects of vision tested by the medical officer are actually the same
as those used in the work. Applied psychologists in Germany were
always very prone to list the hypothetical abilities involved in a job
and to assume that, by devising one or two tests of each of these,
they could predict job ability. British and American psychologists
are generally more empirically-minded, and it is fair to claim that
they know more about the predictive value and the limitations of
their tests and other methods than do any of the hosts of
organisations which conduct educational, professional, trade or other
examinations.

Validation and follow-up investigations are carried out with: 

(i) Vocational procedures as a whole. 

(ii) Each part of a procedure, such as the separate tests,
interviews or questionnaires.

(iii) The separate items that compose a test or questionnaire, the
aim being to improve the instruments by choosing the best 
items or revising the ineffective ones. 

It is realised too that procedures or parts of them may have 
satisfactory validity in relation to one type of work, yet not to 
another superficially similar type. Hence they should be validated 
afresh for each job, and the process should continually be kept up 
to date with alterations in the jobs. 
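
The third level of validation, that of separate items, can be sketched in modern terms as follows. This is only an illustrative example under stated assumptions, not the procedure used in the Services: each item is scored 1 (passed) or 0 (failed), the criterion is an invented proficiency grading, and the retention threshold of 0.2 is arbitrary. Each item is correlated with the criterion, and items falling below the threshold are marked for revision.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def item_validities(responses: Dict[str, Sequence[int]],
                    criterion: Sequence[float]) -> Dict[str, float]:
    """Correlation of each item (scored 0/1) with the proficiency criterion."""
    return {item: pearson(answers, criterion)
            for item, answers in responses.items()}


def items_to_keep(validities: Dict[str, float],
                  threshold: float = 0.2) -> List[str]:
    """Items whose validity reaches the (arbitrary) threshold."""
    return [item for item, r in validities.items() if r >= threshold]


# Invented data: six candidates, three items, one proficiency grading each.
responses = {
    "item_1": [1, 1, 0, 0, 1, 0],
    "item_2": [1, 0, 1, 0, 1, 0],
    "item_3": [0, 0, 1, 1, 0, 1],
}
criterion = [8, 7, 3, 2, 9, 4]

validities = item_validities(responses, criterion)
kept = items_to_keep(validities)
print(validities, kept)
```

With so few cases the coefficients are of course unstable; a real item analysis would need far larger numbers and would have to be repeated for each job.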

Thus, any system of classification which does not provide for the
validation and follow-up of its methods, and which is not
continually trying to improve these methods in the light of their
results, should be regarded as little better than quackery. We
entirely agree with Macrae (1932) and others who hold that
successful guidance is, in many respects, an art as well as a science,
and with Burt’s (1942-3) admission that the intuitions of the
qualitative or clinical investigator have often led to more original
advances in psychology than have the quantitative analyses of the
statistician. Nevertheless, it is essential for the vocational
psychologist to be trained in scientific techniques of investigation and
statistical methods, so that even if he is not himself responsible for
follow-up he can realise the importance of statistical checks and
can interpret the significance of results obtained by others.

Training of Personnel 

Since 1942 the supply of trained psychologists has been quite 
inadequate to cope with the demands of the universities, the 
Forces, education authorities and clinics, industrial and other 
employers, and these demands appear to be increasing. It is essential 
therefore to hand over much of the day-to-day work of
classification to less highly-qualified staff, as was done quite successfully
in the Forces. There are, indeed, positive advantages in using 
officers who know more about the Navy or Army, teachers who 
know more about education, and personnel officials who know 
more about industry, than the psychologist himself is likely to 
know. Moreover, as already indicated, the psychologist prefers to 
educate organisations to carry out appropriate vocational schemes 
and to give technical help where needed, rather than to impose a 
scheme run by technicians. The system of using non-psychological 
staff worked well in the Navy since the recruiting assistants and 
P.S.O.s were very carefully selected and trained in the first place, 
were kept under constant supervision, and were fed with up-to-date
information on naval requirements, job analyses and psychological
techniques. It was only slightly less satisfactory in the Army
because of the very large numbers involved, and because the less 
effective powers of the qualified psychologists made it difficult to 
maintain as high standards. In both Services, however, a check 
was kept on each member of the staff; for the work of a P.S.O. who 
interviews and makes employment recommendations is a technique 
which requires validating just as much as is an intelligence test. It 
was often possible to eliminate the weaker ones or to transfer them 
to duties for which they were more suited. Problems both of
overwork and of boredom were often serious, but could be avoided
under conditions which reduced the testing and interviewing load 
and allowed the personnel to be trained for more responsible and 
more varied tasks, such as follow-up, documentation, job analyses, 
and research investigations. 

Large-scale applications of vocational classification in schools, 
such as the Birmingham and Ayrshire experiments, have relied 
mainly on the part-time work of partially-trained teachers, and 
short courses given by the N.I.I.P. have helped to provide
moderately well qualified careers masters and mistresses for schools all
over the country. But there are very grave dangers in such schemes 
being undertaken by insufficiently skilled personnel, dangers not 
only to the pupils or employees who are misdirected, but also for 
the reputation and progress of the whole of vocational psychology. 
At present an employer or education authority has no easy means 
of distinguishing a competent person from a quack. A degree 
in psychology or possession of Associateship of the British 
Psychological Society does not necessarily show ability to
undertake selection or guidance. Nor, on the other hand, are the many
persons who acquired valuable experience in the Forces and 
carried out excellent classification work necessarily competent to 
organise civilian schemes on their own. Presumably what is needed 
is a central training institution conferring diplomas which would 
be as widely recognised as the London School of Economics’ 
Mental Health Certificate for psychiatric social workers (cf. 
Rodger, 1944). It would be desirable too for such diplomas only to 
cover a limited period of years, a refresher course and a
re-examination in knowledge of techniques and jobs, and in interviewing
skills, being required before they were re-issued. Fortunately,
psychologists are generally aware of the present unsatisfactory 
state of affairs, hence there is reason to hope that steps will be 
taken both to increase the supply of trained workers and to assess 
and maintain their standards of efficiency. 

A further problem which has not yet been studied at all fully
is the ethical one of the use to which information amassed by
psychologists or their assistants may be put. Often such
information deserves to be treated with as great confidence as that received
by a doctor or solicitor. It would be intolerable, for example, if
children’s test scores, or material elicited in clinical interviews,
were passed on to employers. We did not, in the Forces, even
allow the information we possessed about recruits to be handed to
other Government departments without the recruits’ permission,
since it had been collected not voluntarily but under conditions of
Service discipline. At the same time the psychologist is not in the
position of a doctor who can state: “X is not fit to do this type of
work, but I will not disclose what is the matter with him.” The
solution to such problems appears to lie, not so much in the
development of still more reliable and valid techniques, as in the
establishment of vocational psychology as a profession such that
the judgment of a psychologist with appropriate qualifications will
be accepted as readily as that of his medical or legal colleagues.

THE VALUE OF VOCATIONAL CLASSIFICATION
PROCEDURES 

Abstract.—Validation of tests or other procedures necessarily
involves such statistical concepts as correlation coefficients, partial
and multiple correlation, and reliability. Procedures may be
compared with a variety of criteria of proficiency, i.e. of success or
failure at the work, including:

(a) Objective records of output, accidents, etc.

(b) Differences between groups of known proficiency.

(c) Results of examinations, trade tests, etc.

(d) Merit ratings or other subjective gradings.

Examples of these types are listed. Some of the difficulties in
collecting reliable information on proficiency under Service, or
under industrial, conditions are pointed out. Large numbers of
cases are needed for adequate validation studies, and the variability
of results in different studies should be checked. Proficiency often
depends on a great many factors besides the suitability of the
individual workers, and these must be controlled as far as possible.
The workers whose success or failure is followed up often
constitute a selected group, and this seriously distorts the results
obtained. Thus, in judging the value of vocational classification
these technical difficulties must be borne in mind.

Pre-war research has demonstrated the effectiveness of
vocational guidance as developed by the N.I.I.P.; but substantiation
of student counselling and child guidance procedures is so far less
adequate. Statistics of the proportions of satisfactory trainees or
of officers in the Services, who have been selected by psychological
methods, do not suffice to prove the value of these methods. But
several studies are cited where recruits so selected were found to
be more successful than those selected by older methods. Though
the follow-up, and assessment of the ultimate proficiency of,
officers are particularly tricky, some fairly satisfactory results were
obtained on the validity of W.O.S.B. procedures as a whole.
Individual board members showed considerable variations in skill,
and much more investigation is needed (and is now being carried
out) into the reliability and validity of the separate parts of board
procedures.

The importance of validation was pointed out in Chapter VI. 
Inevitably the subject is a somewhat complex and technical one, 
but we will attempt to describe its main principles and difficulties 
as simply as possible. Thereafter we shall present some of the 
evidence for the value of vocational adjustment procedures as a 
whole. Certain essential concepts such as correlation and reliability 
have been discussed elsewhere (cf. Vernon, 1940), and are fully
treated in most statistical textbooks. Thus brief definitions will
suffice here. 

Correlation. When any two sets of scores or gradings are compared,
for example results on a selection test and subsequent
degree of proficiency, the closeness of agreement can be expressed
as a correlation coefficient ranging from +1.0, meaning perfect
agreement, to 0.0, meaning no agreement at all. If those who do
well at the test tend to do badly at the job, the coefficient will be
negative, though this rarely occurs. More often coefficients are low
positive (+.2 to +.3), moderate (+.4 to +.6), or fairly high
(+.6 to +.7), though far from perfect. A note on the implication,
or concrete significance, of the size of correlation coefficients is
appended at the end of this Chapter.
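The definition above can be illustrated with a minimal sketch of the ordinary product-moment coefficient. The function name and the two score lists are hypothetical, invented only for illustration; they are not data from any Service study.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists of at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical figures: selection-test scores and later proficiency marks.
test_scores = [12, 15, 9, 20, 17, 11]
proficiency = [48, 55, 50, 70, 58, 45]
print(round(pearson_r(test_scores, proficiency), 2))
```

A coefficient near +1.0 would mean the test ranks the men almost exactly as their later proficiency does; a value near 0.0 would mean the test tells us nothing about it.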

Partial Correlation may be used when it is desired to find
whether some additional test, examination mark or grading
contributes anything to the prediction of some criterion, say job
proficiency, which is not already covered. Suppose, for example,
pupils are selected for secondary education solely by an English
examination; it is likely that the addition of an arithmetic
examination would cover rather more of the ground, provided that English
and arithmetic marks do not themselves overlap very closely. If
the pupils’ marks on the two subjects correlate perfectly, then
obviously either will predict as efficiently as the two combined.
The partial correlation of arithmetic shows its extra contribution
when English is, as we say, “held constant.”
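The English and arithmetic example can be sketched with the standard first-order partial correlation formula. The three input correlations below are hypothetical values chosen for illustration, and the function name is an assumption of this sketch.

```python
from math import sqrt


def partial_r(r_crit_arith, r_crit_eng, r_arith_eng):
    """Correlation of the criterion with arithmetic, English held constant."""
    if abs(r_crit_eng) >= 1 or abs(r_arith_eng) >= 1:
        # With perfect overlap the second subject can add nothing,
        # and the formula is undefined (division by zero).
        raise ValueError("partial correlation undefined for perfect overlap")
    numerator = r_crit_arith - r_crit_eng * r_arith_eng
    denominator = sqrt((1 - r_crit_eng ** 2) * (1 - r_arith_eng ** 2))
    return numerator / denominator


# Hypothetical figures: each subject correlates .5 with later success,
# and the two subjects correlate .6 with one another.
print(round(partial_r(0.5, 0.5, 0.6), 2))
```

The more closely the two subjects overlap, the smaller the partial coefficient becomes, which is the sense in which arithmetic then "covers" little ground not already covered by English.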

Selection is generally based on several tests, etc., for example, 
English, arithmetic and an intelligence test, and their combined 
predictive value is shown by their multiple correlation coefficient. 
As some tests, considered singly, are likely to be more valid than
others, the better ones should, of course, receive more weight. By
means of statistical analysis a multiple regression equation can be 
found showing the most appropriate weighting for each test, and 
by summing these weighted scores the highest possible multiple 
correlation is obtained. 
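For the simplest case of two tests, the weighting and the resulting multiple correlation can be written down directly from the three correlations involved. The sketch below uses the standard two-predictor formulae for standardised scores; the figures and function names are hypothetical, and a real battery of several tests would need a full regression analysis instead.

```python
from math import sqrt


def two_test_weights(r1, r2, r12):
    """Standardised regression weights for two tests.

    r1, r2: correlation of each test with the criterion.
    r12:    correlation between the two tests (must not be +1 or -1).
    """
    denom = 1 - r12 ** 2
    return (r1 - r2 * r12) / denom, (r2 - r1 * r12) / denom


def multiple_r(r1, r2, r12):
    """Multiple correlation of the optimally weighted pair with the criterion."""
    b1, b2 = two_test_weights(r1, r2, r12)
    return sqrt(b1 * r1 + b2 * r2)


# Hypothetical figures: validities of .5 and .4, tests intercorrelating .3.
b1, b2 = two_test_weights(0.5, 0.4, 0.3)
print(round(b1, 2), round(b2, 2), round(multiple_r(0.5, 0.4, 0.3), 2))
```

Note that the more valid test receives the larger weight, and that the combined coefficient exceeds either single validity only to the extent that the two tests do not overlap.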

Reliability refers to the stability of test scores, or their
trustworthiness, quite apart from whether the test measures any
particular ability or predicts any job proficiency. If testees’ scores
alter considerably on taking the test a second time, or on taking
a second exactly parallel test, the reliability (shown by the
correlation between their two sets of scores) will be low. Such an
untrustworthy test is unlikely to possess good validity either. When both
a test and a criterion have reliability coefficients of .6, the maximum
possible validity of the test will be .6 instead of 1.0. A good test of
adequate length should have a reliability exceeding .9, and
certainly not less than .75*. More often the criteria against which
selection procedures are validated show poor reliability, as when
the correlation between judgments of the efficiency of the same
men by two instructors reaches only +.6.
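The ceiling quoted above follows from the classical attenuation relation, under which an observed validity cannot exceed the square root of the product of the two reliabilities. A minimal sketch, with a hypothetical function name:

```python
from math import sqrt


def max_validity(test_reliability, criterion_reliability):
    """Upper limit on the test-criterion correlation (classical test theory)."""
    return sqrt(test_reliability * criterion_reliability)


# The case in the text: both reliabilities .6, so the ceiling is .6.
print(round(max_validity(0.6, 0.6), 2))
# A good test (.9) validated against a poor criterion (.6) is still limited.
print(round(max_validity(0.9, 0.6), 2))
```

The second line of output shows why an unreliable criterion can make even a sound test appear to have modest validity.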

Statistical Significance. There are such big differences between
different individuals that results based on only a few cases
(whether average test scores or correlations, etc.) are apt to vary
also, or to be untrustworthy. Thus, in one group of, say, six men
there might be perfect agreement between an aptitude test and 
proficiency, in another group of the same size no agreement at all. 
But the extent of such chance variations can be calculated and the 
trustworthiness of a result determined. A correlation coefficient is 
said to be statistically significant when we have proved that it would 
be most unlikely to have arisen by chance. Similarly the difference 
between the scores of two groups, or between the correlations 
obtained in two groups, can be shown to be significant and reliable, 
or not so. 
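One common way of gauging the chance variation of a correlation is Fisher's z transformation, sketched below. This is an illustrative choice and not necessarily the method used in the studies described here; it is only approximate, and rather rough for groups as small as six. The function names are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correlation_interval(r, n, confidence=0.95):
    """Approximate confidence limits for a correlation from n cases."""
    if n < 4:
        raise ValueError("Fisher's z needs at least four cases")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and +1")
    z = atanh(r)                      # Fisher's z transformation
    se = 1 / sqrt(n - 3)              # standard error of z
    crit = NormalDist().inv_cdf(0.5 + confidence / 2)
    return tanh(z - crit * se), tanh(z + crit * se)


def is_significant(r, n, confidence=0.95):
    """True when the interval excludes zero, i.e. chance is an unlikely cause."""
    low, high = correlation_interval(r, n, confidence)
    return low > 0 or high < 0


# The same coefficient of .5 in a group of six and in a group of two hundred.
print(is_significant(0.5, 6), is_significant(0.5, 200))
```

The same coefficient that is quite untrustworthy in a group of six becomes dependable when it rests on a couple of hundred cases.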

Types of Criteria in Validation

A great variety of criteria have been used in validating guidance 
and selection procedures. The following classification is based on 

* This account is greatly over-simplified, since reliability coefficients depend
on numerous factors, such as the population tested, and the method of
calculation, as well as on the test itself. Moreover, a test with quite low reliability may
be quite worth using if it makes a significant contribution to the validity of a
battery. But this is not the place for a full discussion (cf. Guilford, 1946;
Cronbach, 1947).

articles by Farmer (1931) and Stott (1939, 1943), Viteles’s book
(1932), and on experience gained in the Forces. 

A. Objective Records of Individual Performance 

Selection tests have been compared with the quantity of output 
in industry, or with objective measures of the quality of the work, 
with words per minute transmitted or received by telegraphists or 
taken down or typed by stenographers, with hits on a target in
rifle-shooting, and so forth. Indirect measures of proficiency include
wages on a piece-rate job, accidents among operatives or motor 
drivers, breakages or spoiled work, and savings in the cost or time 
of training employees. The numbers of changes of posts, or the 
length of tenure of posts, have been used in assessing the value of 
vocational guidance. 

While such criteria are generally the most satisfactory because 
they are the most clear-cut and the least affected by subjective 
judgments of employers or instructors, yet they often show rather 
poor reliability (hence records may need to be collected over a con¬ 
siderable period), and are often distorted by factors not under the 
control of the worker, e.g. breakdowns in machinery or Trade 
Union restrictions. It is seldom, too, that comparable
measurements can be obtained on a sufficiently large number of cases.

B. Differences between Groups with Known Characteristics 

Men engaged on a given job, and presumably for the most part
competent at it, may be contrasted with others not so engaged.
Thus, a mechanical aptitude test should differentiate between
mechanics and clerks. Advancement is a useful criterion, as when
officers, instructors or promoted men are contrasted with beginners or
“other ranks.” “Drift to recommended job” is explained below. An
example of an indirect measure is operating costs such as the saving
in current on a tramway system after the institution of selection.

C. Results of Theoretical or Practical Examinations 

School or technical examinations, trade tests, tests of elementary 
training among infantry, constitute examples. In lengthy training 
courses with several stages, the stage that a trainee reaches without 
failing may be used. Various combinations of examination marks
may be derived, e.g. by factor analysis (cf. p. 168). 

In all of these the judgment of one or more examiners, instructors, 
etc., enters either in the marking or grading of the candidates’ work 
or, as in “new-type” tests, in the setting of the questions (cf.
Vernon, 1940). Hence variations of standards and the prejudices of
different examiners inevitably affect the results. 

D. Gradings or Assessments 

This includes gradings based on observations of the man rather
than of his performance at particular tasks. Merit ratings in
industry or specially collected assessments of the efficiency of
officers or men in the Forces are the main examples. Such
judgments are extremely liable to be biased by the social qualities, or
conformity to discipline, etc., of the people being graded. Many
forms of rating scale and questionnaire have been developed in an
attempt to reduce the effects of halo, variations in the judges’
standards, and other defects (cf. Vernon, 1938a). Some of the
methods adopted in the Forces are described below.

Other criteria under this heading include the award of military 
decorations, the development of psychiatric breakdown, and
self-ratings of satisfaction with their jobs by people given vocational
guidance. 

Naturally there is much overlapping. For example, trade tests
(C) are often marked objectively and may, therefore, fall under (A).
In many examinations, too, the examiner knows the examinees and is
influenced in his marking of their work (C) by his general
impression (D). Another classification which cuts across this is “training
v. operational.” All types of criteria may be applied during or at
the end of training, or else to fully trained men working at the job.

The Collection of Validatory Data 

Some books on applied psychology give the impression that 
validation is a simple and straightforward matter of correlating 
selection test results with a criterion of proficiency. Actually, it is
very rare either in industry or education that a satisfactory, “ready- 
to-wear,” criterion is available. Under conditions of practical life, as 
contrasted with the controlled conditions which the psychologist 
imposes when working in his laboratory, there are innumer¬ 
able disturbing influences which must be taken into consideration 
and, as far as possible, corrected. 

Straker (1944) has given an illuminating description of the
collection of data in the Navy. Least information is usually
available about the most interesting cases: those who fail in their
training or at the job. The psychologist should if possible interview such
men and their instructors in order to pin down the causes of failure,
whether medical, disciplinary, or lack of ability or interest. Some
training schools get rid of a number of trainees on arbitrary and
inadequate grounds before the training course even starts, hence
he should work on lists of men put forward for training rather than
on the school’s own class lists. Next he may find that several men
have been “back-classed,” i.e. transferred to a later class in order
to increase the effective length of their training, and again the
reasons may be varied: medical or compassionate as well as
inefficiency. Training courses often have several sections which are
separately marked, and the psychologist must decide which of these
are essential for his purpose. The mere final pass or fail results are
unlikely to be adequate for statistical analysis.* When a particular
stage of training is conducted in more than one establishment, or
by several different instructors, it is important to study the
comparability of the sets of results. Again the psychologist’s plans are
liable to be upset by sudden alterations in the length of the training
course, or in the syllabus, e.g. with the invention of some new
piece of equipment, or by the transfer of the training to a new
school. Although the numbers under training at any one time may
be small, he cannot usually wait to collect proficiency data on large
groups. He must select and advise recruits as best he can, on the
basis of past experience of similar jobs and on his own judgment,
and improve his techniques as he goes along. All these difficulties
have their parallels among school children (cf. McClelland, 1942),
and among apprentices or trainees in industry.

Success at training is obviously a less satisfactory criterion than 
success at the job, but nevertheless usually has to be employed. 
Training may last only a few weeks but often extends to one or 
more years among skilled tradesmen and professional workers, 
e.g. doctors. In such cases the psychologist should try to follow 
the trainees throughout, but is forced to guide his methods chiefly
by results obtained in the early stages. Moreover, he has to select 
men who will pass training courses even if he is convinced that job 
requirements differ considerably from training requirements. 
Much evidence was collected, especially in the Army, as to the 
inappropriateness of some training courses. For example, many 

* The statistician can use any kind of results, such as rankings from highest
to lowest, or Excellent, Good, Average, Fair and Poor grades, or merely Pass v.
Fail, provided the numbers are large enough. But an extended distribution of
scores is the most convenient, and most economical of numbers (cf. Vernon,
1946b).

recruits from civilian engineering trades could not be
recommended for corresponding Army trades since they were too poorly
educated to learn the required theory. Yet, in the field, these same 
men were transferred into tradesmen, without any training, and 
managed the work satisfactorily. 

When gradings (Type D) had to be used rather than marks
(Type C), it was found best to formulate highly concrete questions
which forced the grader to assess the performance of each man in
his training or at his job, instead of giving an all-round opinion or
judgments of general personal qualities. Useful questionnaires for
following up officers were developed by W.O.S.B. psychologists,
containing some twenty to thirty items each with two to four
possible answers*. A similar but briefer questionnaire for other
rank investigations is reproduced here. Though the filling up of
such a form for a number of men appears formidable, graders find
it easy to use, since it is couched in their own language, and feel
more confident than they do in grading general traits. It is best
done under supervision but can be sent out and returned by post
when the men to be assessed are scattered in many different units.

Follow-Up Questionnaire
Part I. General Soldierly Qualities

A. 1. However great the stress of battle, he never showed signs of cracking.
   2. He stood up quite well when things were unhealthy.
   3. He needed nursing when the going was tough.
   4. He was not much use in battle.

B. 1. In action you could always depend on him to do the right thing without
      being told.
   2. When not under immediate command, he usually kept his head.
   3. He always needed to be led and told what to do.

C. 1. Hard conditions tended to get him down.
   2. He accepted bad conditions cheerfully enough.
   3. He helped to keep up the men’s spirits when conditions were bad.

D. 1. He is rather a disturbing influence.
   2. He gets on all right with the others.
   3. He is always very popular in any group he is with.

E. 1. He always kept alert and vigilant, even when direct contact with the
      enemy was not expected.
   2. He was reasonably vigilant.
   3. He was inclined to become slack after a period of strain.

F. 1. He is an intelligent soldier who can adapt himself to novel or unexpected
      situations.
   2. He is bright enough, and seldom at a loss.
   3. This is rather a dull man who gets stuck when he meets some unforeseen snag.

* W.O.S.B. follow-up was influenced, like W.O.S.B. selection, by Field
Theory. Hence it avoided asking for judgments of independent traits. Actually,
however, the same conclusions as to the advantages of concrete, graphic rating
scales had been reached by Behaviouristically-minded American psychologists
in the 1920’s (cf. Vernon, 1938a).

G. 1. His attitude to authority and to correction is apt to be “difficult.”
   2. His attitude towards his seniors is that of an ordinary good soldier.
   3. He is exceptionally loyal and co-operative towards his officers and N.C.O.s.

Overall Assessment

Outstanding   Very High   Satisfactory   Passable   Not up to Standard
(Best 3)      (6)         (12)           (6)        (3 Least Satisfactory)

Part II. Proficiency at His Job

Main Job ..........    Alternative Job ..........

Operational Experience: Extensive?  Moderate?  Slight?

H. 1. He is outstandingly effective in handling tools of his trade (arms, tools,
      equipment, etc.).
   2. He is competent and effective enough.
   3. He is rather ineffective.

J. 1. His heart has never really been in his job.
   2. He is interested in his job.
   3. He is a real enthusiast about his job.

K. 1. Physically he is well up to the job.
   2. He is handicapped at his job by physical shortcomings.

L. 1. He has a flair for improvising (with tools, materials, etc.) in an unexpected
      difficulty.
   2. He is reasonably good at making the best use of what is to hand when
      things go wrong.
   3. He is lost without the usual tools, materials, etc.

M. 1. He has no eye for ground and cover and is apt to make clumsy mistakes.
   2. He makes reasonably intelligent use of country.
   3. He has a real “poacher's instinct” for using ground and cover.

Overall Assessment

Outstanding   Very High   Satisfactory   Passable   Not up to Standard
(Best 3)      (6)         (12)           (6)        (3 Least Satisfactory)

A distinct method developed by Admiralty psychologists in 1942
might be termed the clinical method, since the graders are not
required to give any verbal, numerical, or other gradings. A
conference is held with the officers or others who have had most to do
with the men, and a general discussion is held to bring out the
strong and weak points of each man as a whole. The discussion is
kept on as concrete a level as possible; such questions are asked
as: “Which men would you choose to take with you on a small
landing party, and why?” The psychologist thus tries to
disentangle the efficiency or other relevant qualities of the men from the
less relevant social-emotional qualities which so often bias the
graders’ judgments. For example, cross-examination may reveal
that a recruit would be considered excellent at his job, but that he
is an ardent communist. From the discussion the psychologist
decides which men are, say, outstanding, above average, fair or
poor in respect of the qualities that concern him. He can also judge
whether some of the raters are too prejudiced or too little acquainted
with the men for their cases to be worth including in the
investigation. He should, of course, be ignorant of the men’s test scores or
other selection data, so that his gradings will be entirely impartial.
Both the questionnaire and the clinical approaches demand a
thorough knowledge of the job, so that the questions may be
framed around its essential features.

Often conference and rating scale methods are combined. The 
raters of several groups of men can be present simultaneously, 
since this helps to familiarise them with the procedure. But if there 
are two or more raters for any group of men (a highly desirable
condition), the most junior in rank should be asked for his opinion
first, or they should be interviewed independently. 

Clearly the collection of good follow-up evidence requires an 
experienced and tactful investigator. For example, some officers
might resent the imputation that any of their men are poor or
inefficient, but may be persuaded to discriminate if asked, “Which
men need to improve?” W.O.S.B. psychologists point out an
important difference between “out-group” and “in-group”
judgments. Trainees who will shortly pass on to other units are regarded
by their instructors or temporary officers in a very different 
way from men belonging to an officer’s own company or crew. 

Treatment of Validatory Criteria 

Only a few of the complex technical problems raised by
validation can be mentioned here. Often the apparently poor showing of
vocational procedures is largely due to unreliability or
inappropriateness of the criteria. Accident rates, for example, are very
unreliable. Even if collected over as long a period as two years, the
correlation between the rates in the first and second year is
generally no higher than +.4. The untrustworthiness of ordinary
written examinations is well known. Two parallel one- or two-hour
papers in the same subject usually correlate .6 or .7, and the
agreement between two examiners marking the same papers may be no
higher. Practical or oral tests are generally poorer still since they
can only sample a small fraction of the candidates’ skill or
knowledge. Merit ratings on personal traits or general efficiency are
commonly found to yield reliability coefficients around .6.
However, by combining several sets of observations, marks or ratings, a
sufficiently precise or stable criterion can usually be achieved*.
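The gain from pooling can be estimated with the Spearman-Brown formula, sketched below. This assumes that the several ratings are parallel, i.e. equally reliable and aimed at the same quality, which real gradings satisfy only roughly; the function name is hypothetical.

```python
def pooled_reliability(single_reliability, k):
    """Spearman-Brown estimate for the average of k parallel measures."""
    if k < 1:
        raise ValueError("k must be at least 1")
    r = single_reliability
    return k * r / (1 + (k - 1) * r)


# Ratings with a reliability of .6, pooled over one, two and four judges.
for judges in (1, 2, 4):
    print(judges, round(pooled_reliability(0.6, judges), 2))
```

On these assumptions two judges of reliability .6 give a pooled criterion of about .75, and each further judge adds progressively less.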

* When the candidates take half a dozen or more theoretical or practical
examinations, or are assessed on several characteristics, factor analysis is particularly
useful for determining the underlying qualities, and from the results a more
clear-cut criterion can be built up. Such analysis will show, for example, whether
it is legitimate (as most schools and training institutions do) to add all the

Even when criterion scores or grades are statistically reliable,
they may or may not be representative of the proficiency in which 
the investigator is interested. Rifle-shooting scores, for instance, 
obviously do not coincide with all-round proficiency of infantry¬ 
men. In no practical job are marks on purely written examinations 
adequate, and they need to be supplemented with assessments if 
practical tests are not available. Thus the investigator generally 
requires to make at least a rough job analysis, and should consult 
instructors or other experts, in order to interpret the significance 
of the proffered criteria. 

Human beings vary so widely, and their success or failure at a
job depends on so many factors, that no sound conclusions as to
the usefulness of a vocational procedure can be derived from a few
cases. One of the main reasons for the slow progress of vocational
psychology has been the small number of persons in any one
civilian job available for investigation. In education and in the
Forces psychologists have been more fortunate. No fixed rules can
be laid down, but a study of fifty cases or less will usually suffice
only to give some rough indications to the psychologist himself. A
minimum of two hundred is desirable for proving the value of some
selection technique, and eight hundred for providing detailed
conclusions, e.g. on the scoring of items in a questionnaire or the
appropriate weighting of a battery of tests. In order to render a
conclusion twice as secure, or to halve the influence of chance
variations, it is always necessary to quadruple the cases. But much
depends on the care with which the data are collected, and on
the degree of refinement or coarseness of the measures employed.
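The quadrupling rule follows from the fact that the standard error of an average shrinks with the square root of the number of cases. A minimal sketch, with hypothetical function names and an arbitrary standard deviation of 10 marks:

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of a mean based on n independent cases."""
    return sd / sqrt(n)


def cases_needed(current_cases, factor):
    """Cases required to shrink chance variation by the given factor."""
    return current_cases * factor ** 2


# Halving the chance variation of a 200-case study needs four times the cases.
print(cases_needed(200, 2))  # 800
print(round(standard_error(10, 200), 2), round(standard_error(10, 800), 2))
```

The step from two hundred to eight hundred cases mentioned above is exactly one such halving.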

As far back as 1924, Thurstone pointed out that random fluctua¬ 
tions due to the “luck of the draw” are much less serious than varia¬ 
tions in the results of different groups brought about by conditions 
which the investigator cannot readily control. The agreement 
between some procedure and proficiency may differ in different 
firms, schools, or Service units, because the groups are differently 
motivated, e.g. one tester obtains better co-operation than another, 
the training or education differs, the attitudes and general morale 
of the groups vary, the jobs at which proficiency is assessed— 
though nominally the same—differ sufficiently to demand different 
patterns of abilities, etc. Time and again an investigator has 

marks together as measuring a general proficiency factor, or whether it is better
to regroup them into two or more sets and to validate the selection procedures
against each set separately.

obtained very promising results in a preliminary validation
experiment conducted under carefully controlled conditions, perhaps on
a small number of cases, only to find the value of the procedure 
very much reduced when he follows up a larger and more miscel¬ 
laneous sample. Indeed, a widely accepted rule among American 
psychologists is to insist on double validation. If the best com¬ 
bination of a certain battery of tests is worked out from the results 
of one group, this combination should be tried out and followed 
up in another group before it is put into routine practice. Such 
statements as, “Had the following pass-marks been applied on the 
following tests, X per cent. of the men selected would have proved
successful,” should be abjured.

It is becoming recognised in educational psychology that no 
experiment carried out, even on large numbers, in a single school 
is adequate, and that variations in results between several schools, 
rather than variations between individual pupils, should be 
explored by means of analysis of variance techniques (cf. Lindquist, 
1940). We have hardly begun to apply this conception in industry 
and the Forces. It would, however, undoubtedly be far more use¬ 
ful to carry out validational investigations on four groups of fifty 
each, in four firms, training schools or units, and to analyse the 
extent of the differences in these four contexts, than to make one 
investigation of two hundred cases. The latter will only tell us that 
our procedure is valid if applied to other exactly similar samples 
under identical conditions, whereas the former will tell us whether 
our results are sufficiently consistent to be applicable to other 
groups where conditions may vary as much as they did among the 
four actually studied. 
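A much simpler stand-in for the analysis of variance approach is to ask whether the validity coefficients obtained in the several groups differ by more than chance would allow. The sketch below uses the usual chi-square test of homogeneity on Fisher z values; it is an illustrative substitute, not the technique referred to above, and the four coefficients and the function name are hypothetical.

```python
from math import atanh, tanh


def pooled_and_heterogeneity(groups):
    """groups: list of (r, n) pairs, one per firm, school or unit.

    Returns (pooled_r, q, df). q is compared with chi-square on df
    degrees of freedom; a large q means the groups disagree.
    """
    weights = [n - 3 for _, n in groups]
    zs = [atanh(r) for r, _ in groups]
    z_bar = sum(w * z for w, z in zip(weights, zs)) / sum(weights)
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, zs))
    return tanh(z_bar), q, len(groups) - 1


# Four hypothetical groups of fifty with widely differing validities.
pooled, q, df = pooled_and_heterogeneity(
    [(0.7, 50), (0.5, 50), (0.3, 50), (0.1, 50)]
)
# 7.815 is the 5 per cent. point of chi-square with 3 degrees of freedom.
print(round(pooled, 2), round(q, 1), df, q > 7.815)
```

When q exceeds the tabled value, a single pooled coefficient conceals real differences between the contexts, and it is those differences that most need explaining.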

Allowance for Group Differences 

Unfortunately the statistical techniques for the type of investiga¬ 
tion just mentioned are elaborate and not widely known among 
psychologists. And as it is very rarely possible to collect hundreds
of cases selected, trained, and graded for proficiency under
identical conditions, many tricky problems have arisen in
combining the results of numerous small groups. Take the common
instance where the criterion consists of examination or trade test marks
awarded at the end of training, or to pupils at the end of a school 
year, where the trainees or pupils have received their instruction 
in several more or less parallel classes. Such marks depend on:

(1) The examinees’ ability, knowledge and skills.

(2) Their interest, industry and other personality qualities,
including “examination nerves.”

(3) Relevant experience or knowledge prior to training.

(4) Length of training course.

(5) Goodness of instruction or teaching.

(6) The examiners’ standards of marking and personal
idiosyncrasies.

(7) Miscellaneous environmental, health and other conditions.

(8) Chance factors: the luck of the questions set.

While this analysis applies primarily to criteria of Type C, similar 
sources of variation occur with other types. Thus, in Type D it is 
seldom possible to get large groups assessed by the same judge, 
hence the various judges’ standards of grading and their varying 
interpretations of the qualities to be graded tend to produce large 
group differences. Under Type A, weather conditions obviously 
affect the rifle-shooting scores of groups of men who fire on 
different days; accident rates depend on the exposure each group 
has undergone, and so on. Now selection procedure only attempts
to predict Nos. 1 and 2 on our list, and to take No. 3 into account. 

It is essential therefore to keep Nos. 4 to 8 as constant as possible 
in all the groups followed up, if the true agreement between the 
predictions and the results is to be obtained, or else, as already 
indicated, the investigation must be extended to several groups and 
the consistency of the findings analysed. In collecting follow-up 
data as full information as possible must be obtained under
headings 3 to 8. When these factors operate randomly, neither helping
nor hindering groups of initially good or poor quality, and when
they produce variations of not more than about 10 per cent. in the
marks, then it can be proved that their effect is merely to reduce
the validity coefficients slightly (cf. McClelland, 1942). But more
often they operate selectively, for example, the initially poorer
groups get poorer training, or alternatively they are provided with
better instructors. This greatly distorts the correlations,
occasionally raising them spuriously, more often reducing them
considerably. Similarly when an officer or instructor assesses the proficiency
of his men more highly than does a second judge rating a second
group, it may be difficult to tell how far the first group is really
superior in quality, how far the difference is due to No. 6 on our
list, and there is usually no means of allowing for its effect on the
correlations*.

Selectivity 

Psychology resembles the other sciences in that the only sound
way it has of determining the effect of a certain condition is to
observe the results of applying that condition, all other factors
being kept constant. Thus the ideal method of validating a
vocational procedure is to compare the results in a group subjected to
this procedure with the results obtained in another group precisely
similar in all respects except for the absence of the procedure. This
often raises insuperable practical difficulties. One may compare 
output or examination results before and after the introduction of 
a procedure, but almost always the training given to the second 
group, or the job itself, will also have altered, or other changes in 
the examiners, the morale of the workers, etc., will have occurred. 
If gradings are used, they will depend more on the graders’ 
opinions of the procedure than on the efficiency of the men. An 
employer’s satisfaction or dissatisfaction with the men sent him at 
different periods does not constitute scientific evidence. Every
effort should be made then to follow up simultaneously a group 
subjected to the procedure and a control group; and when subjec¬ 
tive judgment enters into the criterion, the examiners or graders 
should not know which men belong to which group. 

Naturally, however, neither the Services, nor employers or 
education authorities, can usually afford to accept such unselected 
trainees, when it is certain that a large proportion of them will fail, 
and the psychologist has to make do with a very unsatisfactory 
alternative. In effect he grades his candidates, at the time of adjust¬ 
ment, into more and less suitable groups and subsequently finds 
whether the former are superior to the latter on the criterion. Thus 
he is forced to deal with selected, and unduly homogeneous,
groups. He can study the value of, say, tests for secondary school
pupils or mechanics only by following up the careers of pupils or
mechanics who were originally chosen largely on the basis of these 
tests. Not only are the differences between his more and less suit¬ 
able groups likely to be smaller than the differences between all 

* McClelland does, indeed, provide a method of scaling the r^ults from 
numerous groups which brin^ them to a common standard, but this is applic¬ 
able otdy under conditions which never operated in the Forces, namely the exis¬ 
tence of some test taken by all the groups which, correlates very highly with their 
ability. 
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his selectees and a control group—Whence his validity correlations 
are greatly reduced—^but also all other correlations tend’to be 
distorted to a varying extent by this selectivity. Suppose, for 
example, he selects men on the basis of previous education, but 
applies selection tests with a view to choosing the best tests and 
substituting them later for the education: he is almost sure to find 
the tests agreeing better with proficiency than does education, and 
yet their agreement will certainly be poorer than it would have 
been had die men been entirely «»selected. In other words, any 
test or selection procedure not used for selection appears to give 
better predictions than procedures which have been used, yet no 
procedure can display its full predictive value unless—^a rather 
unlikely contingency—^it is quite independent of, or uncorrelated 
with, the procedures actually employed. Again it follows that the 
better the selection scheme already in dperation, the poorer will its 
validity appear to be when a selected group is followed up. In one 
experiment in a new mechanics branch in the Navy, the first 300 
men sent for training were almost unselected since little was known 
about the job requirements. "Wth the next 300 more information 
was available and standards were raised. With the next 600 there 
was still further improvement in the selection procedure. But the 
correlation coefficients between T2 and course results in these 
three groups were -69, '63 and ’38 respectively, and all the standard 
naval tests showed similar declines in validity (cf. Vernon, 
1946b). 
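The shrinkage of validity in a selected group can be illustrated by a
small simulation. The sketch below is not drawn from any Service data:
it assumes normally distributed test and criterion scores with a true
correlation of .6 (an arbitrary figure), accepts only the top 30 per
cent. on the test, and compares the correlation in the whole group with
that among the selectees. The function names and parameters are
hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(true_r=0.6, n=20000, keep_fraction=0.3, seed=1):
    """Return (r in the unselected group, r among those selected on the test)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        test = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Criterion built so that its correlation with the test is true_r.
        criterion = true_r * test + math.sqrt(1.0 - true_r ** 2) * noise
        pairs.append((test, criterion))

    # Selection: keep only the highest scorers on the test.
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    selected = pairs[: int(n * keep_fraction)]

    r_all = pearson(*zip(*pairs))
    r_selected = pearson(*zip(*selected))
    return r_all, r_selected


if __name__ == "__main__":
    r_all, r_selected = simulate()
    print(f"unselected group: r = {r_all:.2f}")
    print(f"selected group:   r = {r_selected:.2f}")
```

Under these assumptions the coefficient among the selectees falls well
below that in the full group. This is the same direction as the decline
observed in the naval experiment, though the simulated figures are not
comparable with the real ones.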

There are, it is true, statistical formulae which can theoretically
correct for this selectivity and give the correlations that should
have been obtained in an unselected group (Burt, 1943b). Their
weakness is, however, that the precise basis of selection is seldom
known. Some psychologists or P.S.O.s may have relied largely on
test scores, others on previous education or experience, others on
judgments of temperament and interests. The formulae only work
when all the candidates have been chosen on a particular test or
battery of tests, so that the exact amount of homogeneity
(restriction in range of ability) can be determined*. It often happens,
for example, that P.S.O.s allocate to a certain job men with low test
scores, but with compensating qualities of keenness or relevant
experience, and when these men do well at the job, they reduce the
size of the correlations between the tests and the criterion. No
method of correction is adequate in this situation.

* They assume too that all the variables involved give normal distributions of
scores: this is not necessarily true of all selection tests and is usually untrue of
such factors as experience.
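For the simple case in which selection was made directly on the test
in question, one widely used textbook correction can be written as a
short function. It is offered only as an illustration of the kind of
formula referred to, and is not necessarily the form given by Burt; the
function and argument names are hypothetical. It needs the standard
deviation of test scores in the unselected and in the selected group,
which is exactly the information that is seldom available in practice.

```python
import math


def correct_for_range_restriction(r_selected, sd_unselected, sd_selected):
    """Estimate the correlation in the unselected group.

    Assumes direct selection on the test itself, linear regression of the
    criterion on the test, and equal scatter about the regression line in
    both groups. These assumptions fail when selection rested on other,
    unrecorded grounds such as keenness or experience.
    """
    k = sd_unselected / sd_selected  # how much the range was curtailed
    return (r_selected * k) / math.sqrt(
        1.0 - r_selected ** 2 + (r_selected * k) ** 2
    )


if __name__ == "__main__":
    # Illustrative figures only: r = .36 among selectees whose scores have
    # about half the spread of the full group.
    print(round(correct_for_range_restriction(0.36, 1.0, 0.514), 2))
```

When the two standard deviations are equal the function returns the
observed coefficient unchanged, as it should.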

In conclusion, it may be seen that validation only approximates 
to being an exact science. In most practical situations the validatory 
results are affected by so many uncontrolled influences that a large 
measure of statistical judgment is needed in arriving at sound
conclusions. Often when these results seem disappointing this may
not be the fault of the psychologist or of his methods. Sometimes 
also when they seem rather favourable they need to be accepted 
with caution; the samples may have been too small, or unduly 
heterogeneous, or other defects may be present. We must hope 
that, as the principles outlined above come to be more widely 
understood, it will be possible to plan better experiments which 
will yield more clear-cut answers to validation problems. 

Previous Validatory Studies 

It is difficult to find any good instances in the literature of
scientific validation of vocational selection procedures which
employ interview and other techniques besides tests, though there
are numerous claims, not very well supported, as to the benefits of
introducing psychological methods into industrial organisations,
and thousands of studies of tests. Vocational guidance, on the other
hand, having been conducted largely by academically trained
psychologists, has always laid more stress on thorough validation.
A useful summary of British investigations to date is given by
Stott (1943).

Of the cases who had received full-scale individual guidance
from the N.I.I.P., 463 were followed up three to five years later.
From replies to a questionnaire or interview, the numbers who
were “successful and happy in their work” were judged. Of the
346 cases in jobs which accorded with the Institute’s
recommendations 92 per cent. were successful, whereas of the 118 in
“non-accordance” jobs only 57 per cent. were successful.
Considering that individuals seeking guidance from the N.I.I.P. tend
to be “difficult” cases, whose vocational suitability is very
uncertain, the 92 per cent. success is highly gratifying. So many
conditions may alter in three to five years that a larger percentage
could scarcely be expected.

Investigations which conform more closely to what we have
called classification, rather than guidance or selection, are those
of Rodger (1937) in a Borstal institution and of Hunt and Smith
(1945) at Birmingham. Rodger compared assessments by instructors
of the efficiency and progress of 200 youths allocated to work
parties in the usual manner by their housemaster, and 200
allocated by himself on the basis of tests and short interviews. Of the
former 46.6 per cent. and of the latter 69.6 per cent. were given
Grade A reports. While the superiority of the psychologist’s cases
is not very striking, it is probably reduced by the unreliability of
the instructors’ gradings, and other factors.

In the largest Birmingham enquiry, 1,630 children were followed
up over two years and 603 over four years, roughly half being boys,
half girls. Half of these had been given guidance by psychologically
trained teachers on leaving school; the rest (a control group) had
only received advice at ordinary employment conferences. Numerous
criteria were employed, including employers’ gradings and
length of tenure of posts, all of which gave favourable results. The
most interesting figures are listed in Table I. Among the “guided”
cases, 90 to 93 per cent. of those in recommended jobs state that they
are satisfied, and only 20 to 33 per cent. of those in
“non-accordance” jobs. Of the controls, however, the cases who followed the
employment conference’s advice are if anything less satisfied than
those who did not. Many more guided cases retain accordance
than non-accordance jobs over a long period, but the controls
show practically no difference. Finally, it is noteworthy that as
time passes a larger proportion of guided cases are found in
accordance posts; that is, they show a “drift to recommended job,”
which is absent in the less skilfully advised controls.


Table I. Results of Birmingham Vocational Guidance Experiment

                                  Psychologically
                                   Guided Group          Control Group
                                Accord-   Non-accord-  Accord-   Non-accord-
                                ance jobs  ance jobs   ance jobs  ance jobs
                                per cent.  per cent.   per cent.  per cent.
Satisfied with job at end of
  2 years (self-ratings)           90         26          64         76
Ditto at end of 4 years            93         33          64         78
Retained first job over 2 years    60         11          37         33
Ditto, over 4 years                46         11          27         26
In accordance jobs at end of
  2 years                          87         -           62         -
Ditto, at end of 4 years           92         -           47         -

Student counselling procedures in American schools and colleges
do not yet seem to be so well validated. Thus, Williamson
(1936) summarises several investigations in which arts and
engineering students who were counselled by faculty members
obtained no better college grades than matched controls. But it is
improbable that these counsellors had received any psychological
training. In a later research (Williamson and Bordin, 1940), 80 per
cent. of students who had received counselling and 66 per cent.
of a control group were assessed, one year later, as emotionally
well-adjusted. The difference is rather small, yet statistically
significant. There have been numerous follow-up studies of child
guidance, though few have taken the essential precaution of
including a control group in order to discover what proportion of
maladjusted children improve without receiving guidance.
Nevertheless, C. R. Rogers’ book (1939) shows that the effectiveness of
different types of treatment, e.g. boarding out in foster homes, is
now receiving scientific study. Burt and Lewis (1946) briefly
mention the application of analysis of variance to the evaluation of
different modes of treatment of delinquents and neurotics, one
result of which seemed to be that the personality of the therapist
is more influential than the type of treatment he adopts or the
psychological theory on which it is based. Typical of many
interesting investigations of the effects of different systems of
upbringing on the personalities of the children is that of Baldwin, Kalhorn
and Breese (1946), proving that a democratic rather than
autocratic atmosphere in the home produces greater intelligence,
originality, curiosity and tenacity, and that over-indulgent or
rejectant attitudes among the parents have unfortunate effects.

The Value of Personnel Selection Procedures as a Whole in the Forces

Information on the failure rates of practically all recruits trained
for different A.T.S. jobs in 1944 was collected, and it was
concluded that 94 per cent. had been successfully allocated by P.S.O.s
to jobs they could manage. A similar enquiry covering over 8,000
Army recruits, in several representative Arms, yielded failure
rates averaging less than 6 per cent. Such figures, however, do not
constitute evidence, since the equivalent failure rates for recruits
posted before D.S.P. was established are not known. Also the
quality of recruits may have risen or the standards of the training
schools may have been lowered. If they show anything at all it is
that the Army and A.T.S. authorities were mostly converted, by
1944, to approval of personnel selection.

Much more convincing was a comparison of several groups of
auxiliaries selected by P.S.O.s with other groups selected by other
methods and trained simultaneously. Their respective failure rates,
given in Table II, show great improvement attributable to
personnel selection. Special operators were particularly striking, since
improvements in selection procedure were introduced in several
stages, and with each alteration there was a large drop in the
failure rate.

Table II. Failure Rates of A.T.S. Trainees Selected by Different Methods

                                 Selected by           Selected by
                                 Old Methods             P.S.O.s
Category                        No.   Per cent.       No.   Per cent.
                                       Failed                Failed
Drivers                         124      30          1,004     14
Clerks                          125      11            602      4
Special Operators               420      60            130      7
Operators, Wireless and Line    217       7            187      0.6

War Office and Admiralty psychologists were especially
interested in the selection of tradesmen and mechanics, and collected
some of their most striking evidence in this field. Hotoph compared
the failure rates on training courses of some 10,000 Army
tradesmen selected by four different procedures during four months of
1942, with the results listed in Table III.

Table III. Failure Rates of Army Tradesmen Selected by Different Methods

                                                      Overall     “Adjusted”
                                                      Failure      Failure
Method                                      No.        Rate         Rate
                                                     per cent.    per cent.
Nominated by C.O. or technical officers    3,606       17.0         19.2
Nominated at own request                   3,176       16.6         19.6
Called up by Ministry of Labour as
  semi-qualified tradesmen              [illegible] [illegible]     19.4
Selected by P.S.O.s                        2,201    [illegible]     11.1

Apparently the P.S.O.s’ selectees had only about half the overall
failure rate of men selected by ordinary Army methods and a third
that of men selected by the Ministry of Labour. However, the
comparison is somewhat unfair in that more of the third than of the
fourth group happened to go to trades where failure rates were
abnormally high. An attempt was made to adjust for this and to
equalise the numbers of each group in each trade, with the results
given in the last column. It will be seen that the P.S.O.s’ selectees
are still the best, though less strikingly so, and that Army and
Ministry of Labour cases achieve much the same rates. It should
be remembered that Army P.S.O.s had little psychological or
industrial training and did not, at that time, use any standardised
trade tests or tests of trade knowledge. Hence the main reasons for
their success must lie in the careful checking, at interview, of the
candidates’ previous experience, and in the rejection of candidates
who failed to get fairly high scores on intelligence and other group
tests.
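The adjustment for differing trade allocations described above can be
sketched as a direct standardisation: each group’s failure rate is
worked out trade by trade, and the trade rates are then averaged with
the same weights for every group. This is one common way of making
such an adjustment and is not necessarily the method actually used;
the figures and names below are invented purely for illustration.

```python
def crude_rate(group):
    """Overall percentage failing, ignoring which trades the men entered."""
    entered = sum(n for n, _ in group.values())
    failed = sum(f for _, f in group.values())
    return 100.0 * failed / entered


def adjusted_rate(group, standard_mix):
    """Percentage failing if the group had been spread over the trades in
    the proportions given by standard_mix (direct standardisation)."""
    total_weight = sum(standard_mix.values())
    rate = 0.0
    for trade, weight in standard_mix.items():
        entered, failed = group[trade]
        rate += (weight / total_weight) * (failed / entered)
    return 100.0 * rate


# Hypothetical data: trade -> (number entered, number failed).
group_a = {"easy trade": (100, 5), "hard trade": (300, 60)}
group_b = {"easy trade": (300, 15), "hard trade": (100, 20)}

# Standard mix: the pooled numbers entering each trade.
standard_mix = {"easy trade": 400, "hard trade": 400}

if __name__ == "__main__":
    for name, group in (("A", group_a), ("B", group_b)):
        print(name, crude_rate(group), adjusted_rate(group, standard_mix))
```

In this invented case the two groups fail at identical rates within
each trade, so their adjusted rates coincide, although the crude rate of
group A is nearly double that of group B simply because more of its
men went to the harder trade.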

In the Fleet Air Arm over 16,000 mechanics and fitters were
followed up. Over 6,000 were selected by standard methods before
psychologists played any part, and their overall failure rate on
course was 14.7 per cent. When selection was undertaken by
P.S.O.s, working under a psychologist, not only did the failure
rate for 10,000 men drop to 4.7 per cent., but also they extracted a
much larger proportion of trainees from the available naval recruits
without denuding other mechanical branches which were likewise
making large demands at that time. Table IV shows that
improvement was obtained in every category. While the comparability of
the training courses and examinations in the two periods cannot
be ensured, careful enquiry revealed no grounds for supposing that
there was any relaxation of standards during the period when
psychologists took over. It has been calculated that this reduction
in training wastage resulted in a financial saving of fully £100,000
per annum.

Table IV. Failure Rates of Fleet Air Arm Mechanics and Fitters
Selected by Different Methods

                               Old Method           Psychological Method
                           No. of men  Per cent.   No. of men   Per cent.
Category                     put on     Failed       put on      Failed
                             course                  course
Air Mechanics
  Airframes                  1,333      16.8         2,234        6.9
  Engines                    1,219      21.2         2,218        6.6
  Ordnance                     902       7.3         1,696        0.6
  Electrical                   832      10.1         1,969        4.9
Air Fitters
  Airframes                    963      10.?           723        4.3
  Engines                      890      16.6           743        6.4
  Ordnance                     130      13.1           190        2.0
  Electrical                   271       9.2      [illegible]     1.8
Total                        6,630      14.7        10,008        4.7
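Whether a difference such as that between the two overall failure
rates could have arisen by chance may be checked with an ordinary
large-sample test for the difference between two proportions. The
sketch below applies it to the totals quoted for the Fleet Air Arm
(14.7 per cent. of 6,630 against 4.7 per cent. of 10,008); the function
name is hypothetical, and the test speaks only to sampling error, not
to the comparability of the courses in the two periods.

```python
import math
from statistics import NormalDist


def two_proportion_z(rate_a, n_a, rate_b, n_b):
    """Return (z, two-sided p) for the difference between two proportions,
    using the pooled estimate of the common proportion."""
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (rate_a - rate_b) / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    z, p_value = two_proportion_z(0.147, 6630, 0.047, 10008)
    print(f"z = {z:.1f}, two-sided p = {p_value:.3g}")
```

With samples of this size the difference is many times its standard
error, so that chance can safely be excluded; the remaining doubts are
those of experimental control already mentioned.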

A point of interest which never seems to have been investigated,
except at W.O.S.B.s, is the reliability of vocational guidance or
selection procedures as a whole. If the same group was tested
and interviewed by two psychologists how closely would their
recommendations agree? However, reliability is of little
importance provided that validity is adequate.

Validation of War Office Selection Board Procedures 

Fitts (1940) points out that there is no scientifically acceptable
evidence whatever as to the value of the German officer selection
procedures. A great deal of work was done in this field by the
British Army Research and Training Centre, and this will be
reviewed here since it was concerned chiefly with the value of the
selection as a whole rather than with that of the separate tests or
separate board members. If its results are somewhat disappointing,
this may be attributed largely to the following difficulties:

(1) Owing to the lack of technical control, different boards used
different methods and had different standards. It was hardly
feasible, therefore, to treat candidates graded high, medium
or borderline from different boards as equivalent. Again the
methods and the recording systems changed at intervals
during the war. Pass rates also varied markedly with
operational requirements.

(2) The “selectivity” of the population was so high that even
highly valid procedures would be expected to achieve only
moderate or low coefficients. Only men regarded as
promising by P.S.O.s or by their C.O.s reached W.O.S.B.s at all,
that is barely 10 per cent. of the Army population. Some
60 or 80 per cent. of these candidates were rejected by the
boards, and it was, of course, impossible for rejects to be
sent forward in order to prove what bad officers they would
make. Another 6 to 20 per cent. of accepted cadets failed in
their O.C.T.U. training, and during subsequent field
training an unknown proportion of the less suitable got shunted
off into relatively unimportant jobs. Thus, only the final
cream, amounting perhaps to 60 per cent. of W.O.S.B.
acceptances or about 16 per cent. of W.O.S.B. candidates,
reached front line duties.

(3) Men passed by any one board often go to different O.C.T.U.s
where standards differ and where, very often, the instructors
who know them best are shifted away to new jobs before the
follow-up investigator can catch them. Thus it is impossible
to collect any large group who have been selected,
trained and assessed uniformly. At the operational stage
this scattering is, naturally, even more marked. Only one or
two cases, probably passed by different W.O.S.B.s and
trained by different O.C.T.U.s at different periods, are
likely to be found in any one unit.

(4) While information is rather more easily collected from
O.C.T.U.s than from the field, it is extremely doubtful
whether a cadet’s O.C.T.U. grades correspond closely with
his ultimate value as an officer. An American investigation
(cf. Jenkins, 1947) gave a correlation of only +.16 ± .076
between officer school and subsequent combat ratings*. A
follow-up of 329 officers in various British units, four to
thirteen months after commissioning, i.e. before they had
been in action, yielded correlations averaging +.26
between O.C.T.U. passing-out grades and C.O.s’ opinions.

(5) Although a technique was eventually evolved for securing
reasonably reliable judgments from C.O.s, it was found that
their gradings were greatly affected by the subalterns’ age
and the length of time they had been commissioned, and
that they varied in different Arms. For accurate validation,
therefore, it was desirable to secure large groups in which all
these factors were held constant, a practically impossible
task.

One successful comparison of old board and new board cadets
was made possible by the fact that, during 1942, several boards of
each type were running simultaneously before the old ones were
finally superseded. Ratings were collected on some 1,200 cadets by
follow-up officers who held conferences with the O.C.T.U.
instructors (who did not know which cadets came from which
board). The gradings were reduced to a three-point scale with the
results shown in Table V.

* It is not clear how this figure was reached. From the published tabulations
of Officer School leadership ratings and subsequent combat ratings, the present
writers obtain a tetrachoric correlation of .32.

Table V. Training Results of Officer Cadets Selected by Traditional
and by Psychological Methods

                   Above Average                Below Average
Board     No.         Grades        Average       and Fail
                     per cent.     per cent.      per cent.
Old       491          22.1          41.3           36.6
New       721          34.5          40.3           25.2

The selectees from seven out of the eight new boards were
superior to those from all five of the old boards, and this superiority
was found (though not always to a statistically significant extent)
in all ten O.C.T.U.s, representing different Arms. Actually the
result is less favourable than it looks. A true correlation cannot be
calculated, but it is unlikely that the figures given in Table V
correspond to a higher coefficient than about +.2. Nevertheless,
the combination of slightly more valid selection procedures with
greater attractiveness to candidates resulted in the sending of
two-and-a-half times as many above-average cadets to O.C.T.U.s as the
old boards would have done, within five months after the
establishment of new boards.

The follow-up of 329 officers mentioned above, some of whom
were old and some new board products, yielded almost identical
results at O.C.T.U. Any differences between the two groups on
C.O.s’ ratings in their units were too greatly masked by age, Arm,
and length of commissioning to be calculable. A small-scale
follow-up of thirty-six R.A.F. candidates for administrative commissions
gave promising results. These men were both interviewed by an
R.A.F. interview board and put through the W.O.S.B. procedure.
Their grades at the end of eight weeks’ O.C.T.U. are said to have
correlated fairly closely with the W.O.S.B. predictions (how closely
was not published), and not at all with the interview board predictions.

Several later enquiries in the field showed that C.O.s were well
satisfied on the whole with W.O.S.B. products, only some 7 to 13
per cent. being considered unsatisfactory either in this country, in
the Mediterranean theatre, or in the Army of the Rhine. Thus, in
spite of the continuous drainage of the best material, W.O.S.B.s
appear to have maintained the flow into the sixth year of the war.
But it was clear that C.O.s’ standards were to some extent
dependent on what they received, i.e. that their borderline of
unsatisfactoriness may have dropped. As no control group of
subalterns not sent through W.O.S.B. was available, these figures
prove little. Similar results were obtained in an A.T.S. officer
follow-up. A correlation of +.28 (uncorrected for selectivity) was
obtained between A.T.S. W.O.S.B. and O.C.T.U. gradings, but
the agreement of W.O.S.B. with assessments of A.T.S. officers in
their units, although positive, was very small.

In 1945, assessments of over 600 officers in Infantry and Royal
Artillery were obtained from C.O.s just before the crossing of the
Rhine, and slight but significant differences in efficiency were
found between those who had been passed as As or Bs, Cs and Ds
at one or another of sixteen W.O.S.B.s. For all groups combined the
correlation between the two sets of gradings amounts to about
+.166, but if correction is made for selectivity the coefficient to
be expected (had all W.O.S.B. candidates gone forward) rises to
about +.35. It was also found that W.O.S.B. predictions were
distinctly better for younger than for older men. Among those who
were boarded at 23 years or under the uncorrected correlation was
.23, and among those who were 28 or over it dropped to .06.
Clearly the W.O.S.B.s had attached insufficient weight to age and
experience, qualities which C.O.s particularly valued.
Considering the unreliability of the criterion and the variations between
boards, these figures are fairly satisfactory.

In the post-war period, operational follow-up has been
impossible. But a large number of investigations have proved that
W.O.S.B. grades give fair predictions of success at O.C.T.U.
Under properly controlled conditions, correlations consistently
reach the +.4 to +.6 level, though they depend greatly on the
skill of the particular board members, and on the thoroughness of
the O.C.T.U. assessments.

A fundamental weakness of W.O.S.B. selection was its
unreliability, due to the large part played by uncontrolled subjective
judgment. In one experiment 116 candidates were put through
two boards a fortnight apart, and the average agreement was
represented by a tetrachoric correlation of only .5[?]. Of candidates passed
by any one board, 21½ per cent. were rejected or deferred by
another board. Evidence was collected as to the agreement between
one pair of boards using closely comparable methods, and another
pair using highly “individual” procedures, also on the
correspondence between the gradings of different board members. This was
not published, but it is believed to have shown higher reliability
among the technical than the non-technical members.

In 1946, however, an elaborate experiment, where two teams
composed of highly experienced staff observed or interviewed the
same 126 candidates, showed quite high reliability. Moreover, the
agreement was at least as good when all members worked
independently as when there was collaboration within, or between,
teams. Both M.T.O.s and psychologists observed the leaderless
group and other practical tests. Psychiatrists interviewed
independently, but presidents sat in on each other’s interviews. Table
VI shows the average agreement as to the general suitability of
candidates in the first column. The coefficients for pairs of
observers are high, for interviewers rather lower, and for whole
teams the final coefficient of .80 is reasonably good.

In addition to general suitability judgments, ratings were given
on fourteen to eighteen carefully defined traits and the median
reliability coefficients in the second column show a similar trend.
The last column gives the average correlation between each type
of member and the final gradings of the candidates obtained at a
conference of both teams under an independent chairman. It
should be noted that all the figures quoted are averages based on
several members, and that the separate coefficients for particular
members often showed wide variations. Unfortunately, it was not
possible to follow up these candidates and so to validate the
various sets of judgments.

The conclusion that appears to follow from this review is that
W.O.S.B. methods, applied haphazardly according to the whims
of the staff, are only of slight value, but that when standard
techniques are evolved and applied uniformly by trained and
experienced personnel, a satisfactory reliability may be obtained.
Only under such conditions can good validity be expected,
and a great deal more, admittedly difficult, research is needed
to prove the value of the contributions of the various
parts of the procedure. A number of such investigations
are now under way.

Table VI. Reliability of Judgments by Members of Two W.O.S.B. Teams

                                       Mean        Median      Correlation
                                    Reliability   Agreement    with Final
                                    Coefficients  on Separate   Disposal
                                                    Traits
M.T.O. with M.T.O.                     .86           .77          .83
Psychologist with Psychologist         .78           .69          .83
M.T.O. with Psychologist               .79           -            -
President with President               .66           .68          .76
Psychiatrist with Psychiatrist         .86           .47          .71
President with Psychiatrist            .62           -            -
M.T.O. or Psychologist with
  President or Psychiatrist            .69           -            -
Team with Team                         .80           .68      [illegible]

Note on the Meaning of Correlation Coefficients 

It is very difficult to convey the practical meaning of a
correlation coefficient, say of +.5. It does not imply 50 per cent.
agreement. Indeed, many psychologists would say that it represents an
accuracy of prediction only 13½ per cent. better than pure chance.
The rationale of the figure, 13½ per cent., is as follows. Suppose
that we guessed the actual output on the job of an employee as
lying at the average for all employees; our average error of
prediction might be, say, 100 units. If, however, we predicted output
from the man’s selection test score, our error would still be as big
as 86½ units, on the average, hence the reduction would only be
13½ per cent. It is more enlightening, however, to consider what
proportions of good or poor men are selected by a test with a
certain validity coefficient.
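The 13½ per cent. figure follows from the usual expression for the
standard error of a prediction made from a correlated measure, which
is the standard error of a blind guess multiplied by the square root of
one minus the squared correlation. A minimal sketch (the function name
is hypothetical) shows how slowly the gain grows with the coefficient:

```python
import math


def forecasting_efficiency(r):
    """Percentage reduction in the error of prediction, relative to
    guessing the group average, for a validity coefficient r."""
    remaining_error = math.sqrt(1.0 - r ** 2)  # fraction of error left
    return 100.0 * (1.0 - remaining_error)


if __name__ == "__main__":
    for r in (0.25, 0.5, 0.74, 0.9):
        print(f"r = {r:.2f}: error reduced by {forecasting_efficiency(r):.1f} per cent.")
```

For a coefficient of .5 the remaining error is the square root of .75,
or about 86½ per cent. of its original size, which is the source of the
figures in the text.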

Take as an example some job which only one out of every five
unselected men can perform satisfactorily. If an employer has
1,000 candidates to choose from, and takes 200 at random, only
forty of his choices will be satisfactory. Whereas if he applies a test,
or tests, with a validity correlation of +.5, and chooses the 200 men
with the best test scores, eighty-eight of them (or more than twice as
many) will be satisfactory. This seems to represent a great
improvement, but the following Table VII shows that, from the individual
candidate’s standpoint, the tests are not very successful. When
selection was random 160 were wrongly rejected, since they could
have managed the job, and 160 who could not do it were wrongly
accepted. The total wastage is, therefore, 320 or 32 per cent. When
the tests are used the total wastage falls to 224, that is a
reduction by only 30 per cent. Table VIII similarly shows
the agreement when the validity of the tests is very low, and
quite high. Not until the validity correlation reaches +.74
is the wastage halved.

Table VII. Numbers of Satisfactory and Unsatisfactory Selectees

A. Selected at Random
                 Satisfactory   Unsatisfactory
                    at Job          at Job        Total
Selected              40             160           200
Rejected             160             640           800
Total                200             800          1000

B. Selected by Test Battery
                 Satisfactory   Unsatisfactory
                    at Job          at Job        Total
Passed tests          88             112           200
Failed tests         112             688           800
Total                200             800          1000

Table VIII. Wastage with Tests Having Low and High Validities

A. Test Validity +.26
                 Satisfactory   Unsatisfactory    Total
Passed tests          64             136           200
Failed tests         136             664           800
Total                200             800          1000

B. Test Validity +.74
                 Satisfactory   Unsatisfactory    Total
Passed tests         120              80           200
Failed tests          80             720           800
Total                200             800          1000

Unfortunately these Tables would not be correct if the selection
ratio was altered, unless it became 4/5 instead of 1/5. If 80 per cent.
of men could do the job, Table VII A would become Table IX,
and all the others would be similarly reversed.
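Tables of this kind can be approximated from first principles. The
sketch below assumes that test score and job proficiency follow a
bivariate normal distribution, that one man in five is satisfactory,
and that the best fifth on the test are accepted; it then finds, by
simple numerical integration, how many of 1,000 candidates both pass
the test and prove satisfactory. The function names are hypothetical
and the normality assumption is ours, so the results should be
expected only to lie close to the tabled figures.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def fraction_passing_and_satisfactory(validity, p_satisfactory, p_passed, steps=4000):
    """Proportion of all candidates who both pass the test and are
    satisfactory, for a bivariate normal with the given correlation."""
    test_cut = STANDARD_NORMAL.inv_cdf(1.0 - p_passed)
    job_cut = STANDARD_NORMAL.inv_cdf(1.0 - p_satisfactory)
    spread = math.sqrt(1.0 - validity ** 2)  # scatter about the regression line
    upper = 8.0  # the density beyond this point is negligible
    width = (upper - test_cut) / steps
    total = 0.0
    for i in range(steps + 1):
        x = test_cut + i * width
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal rule
        chance_satisfactory = 1.0 - STANDARD_NORMAL.cdf((job_cut - validity * x) / spread)
        total += weight * STANDARD_NORMAL.pdf(x) * chance_satisfactory
    return total * width


def selection_table(validity, p_satisfactory=0.2, p_passed=0.2, candidates=1000):
    """Return (satisfactory men among those passed, total wastage)."""
    both = fraction_passing_and_satisfactory(validity, p_satisfactory, p_passed)
    passed_satisfactory = candidates * both
    wrongly_accepted = candidates * (p_passed - both)
    wrongly_rejected = candidates * (p_satisfactory - both)
    return passed_satisfactory, wrongly_accepted + wrongly_rejected


if __name__ == "__main__":
    for validity in (0.0, 0.25, 0.5, 0.74):
        good, wasted = selection_table(validity)
        print(f"validity {validity:.2f}: {good:5.1f} satisfactory selectees, wastage {wasted:5.1f}")
```

With zero validity the calculation reproduces the forty satisfactory
selectees and wastage of 320 obtained by random choice, and the
number of satisfactory selectees rises steadily as the validity
increases.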

Table IX. Wastage with High Selection Ratio

                 Satisfactory   Unsatisfactory    Total
Selected             640             160           800
Rejected             160              40           200
Total                800             200          1000


But with a selection ratio nearer to ½ the wastage would be
greater, and the more it diverges from ½ the more favourable the
improvement produced by using selection tests. This can readily
be seen from Fig. 1, which gives a graphical presentation of a
correlation of about +.5. Each dot represents one man’s test
score and his subsequent proficiency. When the selection ratio is ½,
the quadrant A represents men correctly selected by the tests, the
quadrant C men correctly rejected, and the wastage, shown by the
numbers in B and D, is fairly high. But when the ratio is very
high or low as in Fig. 1b, it may be seen that the wrong acceptances,
B, and wrong rejects, D, are relatively few.

Nevertheless, the percentage reduction in wastage does remain
fairly constant so long as the selection ratio is not greater than
84 per cent. or smaller than 16 per cent., and this covers the great
majority of vocational and educational situations. Fig. 2 is
borrowed from McClelland (1942). Let the ratio be h per cent.;
then the wastage when no tests are used, or when their validity is
represented by a correlation of zero, is 2h(100 − h)/100 per cent.
For example, if the ratio is 1 in 5 or 20 per cent., or alternatively
80 per cent., the wastage is (2 × 20 × 80)/100 = 32 per cent., as shown
in our Table VII A. For validities of ·26, ·60 and ·74 the graph
shows that the number of mistakes amounts to 86 per cent., 70 per
cent. and 50 per cent. respectively of this 32 per cent., so giving
the numbers listed in Tables VII B and VIII. This same reduction
would occur with other selection ratios in the middle range.
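
The chance-level figure can be checked with a short sketch. If h per cent. of men are selected at random and h per cent. are satisfactory, wrong acceptances amount to h(100 − h)/100 per cent. and wrong rejections to the same, giving 2h(100 − h)/100 in all. The function names below are illustrative only, and the fractions 0.86, 0.70 and 0.50 are simply the values quoted from the graph; the code does not derive them.

```python
def chance_wastage(h: float) -> float:
    """Percentage of wrong decisions when the test has zero validity.

    h is the selection ratio in per cent.; the same percentage of men
    is assumed to be satisfactory on the job, as in the text.
    """
    return 2 * h * (100 - h) / 100


def wastage_with_test(h: float, fraction_of_chance: float) -> float:
    """Scale the chance wastage by a fraction read off a graph such as Fig. 2."""
    return chance_wastage(h) * fraction_of_chance


if __name__ == "__main__":
    # Ratios of 20 and 80 per cent. give the same chance wastage.
    for h in (20, 50, 80):
        print(h, chance_wastage(h))
    # Fractions quoted in the text for three validities at 20 per cent.
    for fraction in (0.86, 0.70, 0.50):
        print(fraction, round(wastage_with_test(20, fraction), 1))
```

The symmetry of the formula about 50 per cent. is what allows a ratio of 80 per cent. to be treated as the mirror image of 20 per cent.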

So far we have assumed that as many men are rejected or
selected by the test as the proportion found unsatisfactory or
satisfactory on the job. Naturally there is no need for this restriction.
Although it is the commonest and most convenient plan,
McClelland shows that, by an appropriate choice of pass mark on
the tests, the wastage can be further reduced. In fact, for each
selection ratio and each validity coefficient, there is a certain test
pass mark, readily determinable statistically, which produces
minimum wastage. But to portray this situation would complicate
matters, and the reader may be referred to the full Tables provided
by Taylor and Russell (1939), and Tiffin (1946).
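
The effect of moving the pass mark can be illustrated by a small simulation. This is only a sketch under stated assumptions: test score and proficiency are taken to be bivariate normal, the validity of 0.5 and the 20 per cent. of satisfactory men are illustrative values, and the function names are hypothetical. It is not McClelland's method, nor a substitute for the published Tables.

```python
import random
from statistics import NormalDist


def simulate(validity: float, n: int = 20000, seed: int = 1):
    """Draw (test score, proficiency) pairs with the given correlation."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        test = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        proficiency = validity * test + (1 - validity ** 2) ** 0.5 * noise
        pairs.append((test, proficiency))
    return pairs


def wastage(pairs, pass_mark: float, job_standard: float) -> float:
    """Percentage of men wrongly accepted plus men wrongly rejected."""
    wrong = sum(
        1 for test, proficiency in pairs
        if (test >= pass_mark) != (proficiency >= job_standard)
    )
    return 100.0 * wrong / len(pairs)


def best_pass_mark(pairs, job_standard: float, candidates):
    """Return (pass mark, wastage) with the lowest wastage among candidates."""
    return min(
        ((mark, wastage(pairs, mark, job_standard)) for mark in candidates),
        key=lambda item: item[1],
    )


if __name__ == "__main__":
    pairs = simulate(validity=0.5)
    # Standard (in standard-score units) reached by 20 per cent. of men.
    standard = NormalDist().inv_cdf(0.80)
    # Wastage when the pass mark rejects the same proportion as fail the job.
    print(round(wastage(pairs, standard, standard), 1))
    # Search a grid of pass marks around that point.
    marks = [standard + 0.1 * k for k in range(-10, 21)]
    print(best_pass_mark(pairs, standard, marks))
```

Under these assumptions the lowest wastage is found at a pass mark stricter than the one that selects 20 per cent. Fewer men are then accepted, so total wastage is only one of the considerations in fixing a pass mark.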





CHAPTER VIII 


THE BIOGRAPHICAL QUESTIONNAIRE 

Abstract.—A description is given of the Army qualification form
and the naval biographical questionnaire, and attention is drawn
to precautions in the construction and use of such documents.

Occupational experience, education, age, interests and other
items recorded on questionnaires are shown to be significantly
related to success in a variety of Service jobs. Examples are drawn
from investigations of telegraphists, motor drivers, seamen, radio
mechanics, etc. But there are considerable technical difficulties in
turning such information into an objective form in order to yield
direct predictions of occupational suitability. Single questionnaire
items are very unreliable, especially among low-grade recruits, and
they overlap in a very complex manner with one another and with
intelligence.


Typical of the questionnaires used in the Forces is the Army
qualification form, reproduced on pages 132 and 133. As may be
seen it covers Service particulars, previous education and employment,
experience that might be relevant to Army jobs, results of
psychological tests and medical examination, and the P.S.O.s’
conclusions. The questionnaire used in the Navy had additional
sections on school and leisure-time interests, and on leadership
experience (captain of a team, secretary of a club, being in charge
of people, membership of pre-Service organisations, etc.). Though
the questions are printed, they were read out by the tester to
groups of recruits and further explanations were given when
necessary. Incidentally, the front pages of these forms served as a
useful and adequate, although unstandardised, test of literacy. In
the case of the 1-2 per cent. of illiterates who could not produce
an intelligible set of answers even under supervision, the tester
in charge would fill it in for them.

Now even the most straightforward questions may be misread
or misinterpreted, especially by duller recruits. The following list
of precautions is based partly on experience and experiments in the
Forces, partly on the large published literature dealing with the
questionnaire (cf. Koos, 1928; Symonds, 1931; Goodenough and
Anderson, 1931; Good, Barr and Scates, 1936; Vernon, 1938a,
1939b):

(1) The design and layout of the questionnaire should make it 
easy to follow. Bad design may lead to omissions of questions 
and to cramping of the written responses. 

(2) The instructions should be clear and simple, couched in
words of a suitable vocabulary level. Questions should be
unambiguous and, as far as possible, concrete and specific.

A preliminary trial with a moderate-sized group of informants
is extremely useful for showing up obscurities.

(3) If selective or choice-response questions are to be included,
a trial with open questions will reveal responses which might
not have occurred to the psychologist. Too few alternative
answers are obviously undesirable, but too many are liable
to confuse. Actually, owing to the danger of misunderstandings,
it is usually better to avoid choice-response questions,
for example, listing various types of post-primary schooling
and asking recruits to tick the ones that apply to them. Such
a form is certainly easier to code on to punched cards, or to
subject to statistical analysis, but it is always liable to blur
or distort the almost infinite variety of human histories.

When investigations are made, say, into the predictive value
of previous education, the diverse responses can readily be
classified by the psychologist. The only background questionnaire
in the Forces which was coded was that used by
the W.O.S.B.s, where the informants were, of course, above
average in literacy.

(4) Items dealing with interests or types of vocational experience
are very liable to distortion, owing to varying standards
of judgment. Some informants will tick almost every item,
whereas others who may, in fact, have spent as much or
more time on the activities listed, only tick a few to which
they are particularly attracted. The degree of interest or
amount of experience may be checked in a subsequent interview,
but standardised tests are preferable if time allows
(cf. p. 137).

(5) Questions which appear to involve “right” or “wrong”
answers should be avoided in order to reduce any inclination
to falsify. Again, emotional terms or prestige names which
may touch off prejudices or stereotypes are undesirable. If
questionnaire tests or inventories dealing with attitudes or
personal traits are to be applied (cf. Chapter XV), they
should be kept distinct from this essentially factual enquiry.

In general the motivation of the informants requires careful
consideration. There should, in other words, be strong
reasons why they should take the trouble to fill up the
questionnaire accurately.

(6) It should be borne in mind that the questionnaire is in 
essence an interview, to which answers are given in writing,
and that it requires the same skilful formulation of questions, 
the same tact, as does an interview. While some informants 
are suspicious of this impersonal instrument and reveal
themselves more willingly in the face-to-face interview 
situation, others are “bad interviewees” who may be more 
frank in a questionnaire. 

Validity of Biographical Items 

The objectively scored application blank has long been a popular 
technique among American industrial psychologists. The relevance 
of each item is determined empirically and a scoring key prepared 
for use with future applicants. This method was extended to air
crew in the Royal Canadian Air Force, to U.S. Army officers and 
other groups during the war, but was never applied systematically 
in this country. Nevertheless, the relation of questionnaire items to 
occupational success was worked out in some seventeen branches 
of the Royal Navy and Army, and the predictive value of particular 
items such as occupation, educational status, experience of driving 
or morse, age, etc., in a great many more jobs. It is not possible to 
summarise all these results, but some specimen investigations will
be described below. First, however, certain difficulties should be 
pointed out which are just as likely to arise in industrial or 
educational practice as in the Forces. 

(1) The validation of an item to which only two responses are
possible requires much larger numbers of cases than the
validation of a test or other extended variable. Roughly
double as many are needed to give equally trustworthy
correlations. Thus an investigation with less than about 400
cases may be misleading since its indications of the relevance
of certain items might not be borne out in another similar
group. This fact renders the study of previous occupation
particularly difficult since only quite small numbers are
likely to have belonged to any one occupation. Occupations
must, therefore, be classified into very broad categories, and
in the absence of any generally accepted scheme, such classification
tends to be subjective and inconsistent. Significant
differences between the proficiency of different occupational
groups cannot be expected unless each group contains more
than, say, 40 cases; but such groups may be attainable only if
several heterogeneous occupations are combined. For
instance, “clerks” might appear to be a straightforward
group. Yet, in a study of naval officer cadets, accountants,
Civil Servants, solicitors’ and Local Government clerks were
excellent, while works clerks, costing, railway and shipping
clerks were below average, and bank clerks and insurance
clerks were intermediate (an average of 76 in each group).
The name of the occupation may also be a very poor guide,
especially with 18-year-olds. Thus, “mate” may mean anything
from a highly skilled apprentice to a low-grade
labourer.

(2) Items which show fair correlations with proficiency may do
so merely because they involve intelligence, or because they
overlap with other items. Taking occupation again, clerks
were found to be superior in numerous naval branches,
including some of the mechanical ones, but this was due
almost entirely to their high intellectual level which made
them excellent trainees for any job. When intelligence was
held constant they were found, for example, to be poorer in
active and fighting qualities in Coastal Forces, and to provide
an excess of failures in certain mechanical jobs. Similarly the
good results of those who preferred science or mathematics
at school to geography, handwork, etc., mainly reflect the
better schooling and intelligence of the former group. Even
the advantages of membership of pre-Service organisations
(A.T.C., Scouts, etc.) were found to be largely due to the
same factors. Thus, an item analysis of a questionnaire
should at least determine the relation of each item to
intelligence as well as to occupational proficiency, and partial
correlation (or analysis of covariance) should be used to
indicate the independent contributions of each item.*

(3) As already mentioned, reliance cannot be placed on item 
scores obtained from a single group. The scoring should be
applied to a fresh group and its validity determined. 

(4) The answers given to many items are so unreliable that it is
better to substitute tests whenever possible. Hence oral or
written tests of trade knowledge were introduced, also practical
achievement tests in morse receiving, shorthand and
typewriting. It was noted in the Navy that much more
exaggerated claims to experience of morse were made at
recruiting centres, where there was no likelihood of their
being put to the test, than in H.M.S. Royal Arthur, where
allocation to signaller or telegraphist branches was imminent.
That genuine experience was, if anything, more valuable
than good intelligence test scores was proved among telegraphists,
air gunners and other “communications” ratings.
Nevertheless, examples were found of men claiming to
receive at 8 words/min. who failed their morse training, and
of men claiming 16 words/min. who required to be back-classed
before they achieved the requisite accuracy at 20
words/min. Experience of 4 or fewer words/min. was found
to be practically worthless, since morse is often so badly
taught in cadet units as to handicap the early stages of naval
morse training. It seems highly probable that carefully
designed, even if brief, tests of interests would similarly
have more diagnostic worth than merely ticking each of
the activities in which recruits think themselves interested.

(5) Selectivity interferes with the significance of questionnaire
items in even more complicated fashion than it does with
test scores. For example, when P.S.O.s have creamed the
really experienced fitters and turners for the high-grade
mechanical branches, the follow-up of occupation in a less-skilled
mechanical branch is likely to show plumbers and
sheet metal workers as superior to fitters and turners.
Another example may be drawn from communications,
where it was found on more than one occasion that men
claiming mechanical interests were actually poorer than
average as signallers or telegraphists. This may well have
been due to the drafting of more responsible and more able
mechanically-minded recruits to other branches. Theoretically
it might be possible to correct for such selectivity, but
no simple statistical technique is as yet available for analysing
selectivity in a large number of yes-no items.

* Preferably multiple correlation, discriminant function, or multiple contingency
methods should be used in order to allow for the overlapping of items
with one another and with test scores.

(6) Items other than two-response ones frequently show non-linear
relationships to proficiency, so that ordinary correlation
techniques are inappropriate. For example, neither
recruits possessing 2-4 words/min. morse experience, nor
those with 1-6 months’ driving experience, were found
superior to men with no morse or driving experience, but
greater degrees of experience did give an advantage. Among
naval officer candidates those leaving school at 16-17 were
superior to those leaving at 14-16, but those educated to
18-20 showed no further rise. Age tends to show very
irregular effects. In some branches 20-26 year olds were
superior both to 18-19 year olds and to older men. In communications
age below 20 made for more rapid morse learning,
but thereafter there was apparently no big drop till after
30 or 36 years. In Army motor drivers age up to 26 was
irrelevant, but then (at least up to 40) the number of hours of
training needed was roughly identical with years of age.
The significance of such variables must, therefore, be
explored by techniques like chi-squared or correlation ratio,
which makes it difficult to combine them, or to determine
their overlapping, with other items.
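
Two of the techniques named in points (2) and (6) can be sketched briefly: the first-order partial correlation, which holds a third variable such as intelligence constant, and the correlation ratio, which does not assume a straight-line relation. All figures below are hypothetical and chosen only for illustration; none is taken from the Service investigations.

```python
from math import sqrt


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def correlation_ratio(groups) -> float:
    """Eta: square root of between-group over total sum of squares."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return sqrt(ss_between / ss_total)


if __name__ == "__main__":
    # Hypothetical item: correlates .30 with proficiency and .50 with
    # intelligence, while intelligence correlates .45 with proficiency.
    print(round(partial_correlation(0.30, 0.50, 0.45), 3))

    # Hypothetical proficiency marks for three bands of prior experience;
    # the first two bands do not differ, the third is clearly higher.
    bands = [[50, 52, 48, 51], [49, 51, 50, 52], [60, 63, 58, 61]]
    print(round(correlation_ratio(bands), 3))
```

In the first example an apparently useful correlation of .30 shrinks to roughly .10 once intelligence is held constant, which is the kind of overlap described in point (2).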

In spite of these difficulties a number of very substantial associations
were discovered between questionnaire items and proficiency.
In the Navy it was particularly noticeable that leadership experience
was advantageous, not only among seamen and officer candidates,
but in diverse employments such as air fitters, telegraphists,
radio mechanics and safety equipment ratings. Scouts,
members of Boys’ Brigade, and the like, and of organisations such
as the A.T.C. and Sea Cadets showed similar superiority, though
the significance of this finding is, as pointed out, dubious.

Table X shows some specimen correlations between occupational
experience and proficiency in the A.T.S., also between the
best weighted battery of selection tests and proficiency (uncorrected
for selectivity). Although so troublesome to assess, occupational
success is clearly more predictive than test ability in several
jobs. This is particularly true in war-time when training has to be
as short as possible.


Table X.—Previous Occupation, Test Results and Proficiency in the A.T.S.

                                                      Correlations with Proficiency
Civilian Experience     A.T.S. Employment                Experience   Test Battery
Clerical                General Duties Clerks:
                          Overall proficiency               ·296          ·873
                          Stenographic proficiency          ·671          ·371
Clerical                Teleprinter Operators               ·487          ·623
Domestic work           Cooks                               ·204          ·390
Switchboard operating   Switchboard Operators               ·633          ·486
Motor driving           Drivers                             ·610          ·376


The collected results on driving experience from seven representative
driver training regiments in R.A. or infantry are shown
in Table XI. The final driving proficiency marks or grades are
classified into good (the best 30 per cent.), medium (42 per cent.),
and poor (the bottom 28 per cent., including failures). Apparently
amount of experience beyond seven months has little influence. If
a tetrachoric correlation is calculated the coefficient of ·66 is
similar to the A.T.S. result, and exceeds the multiple correlation
of ·42 obtained between the standard test battery and proficiency.
Since experience showed negligible (positive) correlations with
intelligence, a combination of experience and test scores should
theoretically yield much improved predictions. Actually, owing to
the shortage, experienced drivers were accepted regardless of
intelligence levels, and minimum test standards applied only to
the inexperienced.
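
The theoretical gain from combining two nearly independent predictors can be sketched with the standard formula for the multiple correlation of a criterion with two predictors. The values ·66 and ·42 are those printed above; taking the correlation between experience and test scores as exactly zero is an assumption standing in for "negligible", and treating a tetrachoric coefficient and a battery's multiple correlation as two simple predictors is a rough simplification.

```python
from math import sqrt


def multiple_correlation(r1: float, r2: float, r12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r1, r2: correlation of each predictor with the criterion.
    r12:    correlation between the two predictors.
    """
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return sqrt(r_squared)


if __name__ == "__main__":
    # Figures as printed in the text; r12 = 0 is an assumption standing
    # in for the "negligible" correlation of experience with intelligence.
    print(round(multiple_correlation(0.66, 0.42, 0.0), 2))
```

On these assumptions the combined coefficient comes out near ·78, noticeably above either predictor alone.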


Table XI.—Effects of Previous Experience on Driving Proficiency

                                                 Proficiency (per cent.)
Driving Experience before Recruitment      N     Good   Medium   Poor
Professional                               75     48      44       8
Amateur for more than 2 years              66     47      40      13
7 months to 2 years                        68     47      41      12
6 months or less, slight or casual         43     10      42      40
Nil                                       180     16      42      43
Total                                     411     30      42      28


Two rather extensive naval investigations are worth summarising
in some detail. Among 1,000 seamen and stokers undergoing
elementary training, instructors picked out the best three and
worst three from their own classes, yielding 136 “bests” and 136
“worsts” in all. S.P. tests 1-4 all gave very similar correlations, the
multiple coefficient being ·444 and the correlation of T2 ·433.
Items are listed below in order of their statistical significance (T2
being held constant by analysis of covariance).

Age. 36 per cent. of men over 30 and only 9 per cent. of men
under 20 were bests; the relationship was approximately
linear.

Occupation. Retail tradesmen and drivers v. the rest.

Leadership experience.

Membership of Scouts or other boys’ organisations.

Labouring occupations v. the rest. Here the association was
negative.

“Social” interests. Ticking the items playing music, concert
party work, acting, motor driving.

Clerical occupations v. the rest. This showed the highest direct
relationship to proficiency, but it was mostly attributable to
high intelligence.

Service experience, in Home Guard, J.T.C., A.T.C., Civil
Defence, etc.

Education beyond 14, or, negatively, left school at 14 in a class
below the top.

Athletic interests. Ticking the items football, boxing, athletics,
swimming or camping.

Domestic interests. Ticking the items reading, house repairs,
gardening, cooking.

Emotional instability, a “P” rating gave a negative association.
Only 43 men were so marked, but 23 per cent. of these were
worsts as contrasted with 13 per cent. of non-“P”s. This
item, together with mechanical interests and job experience,
was not significant statistically.

The relationship with proficiency and the inter-item relations
were expressed as correlations. Multiple correlation showed that
the most predictive items in combination were T2, age, occupation,
leadership, scouting, Service experience, emotional instability,
domestic interests, and education. A rough scoring scheme
for items other than T2 was made out. When the questionnaires
were scored in this way a biserial correlation of ·461 was obtained,
and for T2 + questionnaire score combined, the correlation was
·633. No opportunity occurred for verifying these figures on a
fresh group.

One other illustration is drawn from a very high-grade naval
branch, radio mechanics, for which selection was particularly
difficult owing to its extreme specialisation. Groups of 860, 600
and 300 were followed up at different times and the correlations of
T2 with course results were only ·338, ·209 and ·286 respectively.
Multiple correlations based on Tests 2 and 3b were only slightly
higher, but in the last two groups an advanced mathematics paper
and an electrical information test gave correlations of ·332 and ·467.

It was noted that all men passing their training were very high test 
scorers, but that by no means all high scorers passed; also that the 
validity of S.P. tests varied significantly in the different centres
where the mechanics were trained. These observations suggest 
that morale and willingness to work at so long and complex a 
course play a very large part. These are factors which may vary 
considerably in different centres, but which may also depend on 
the motivation of individual students. Hence it might be possible 
to measure them by means of a questionnaire. An educational and 
interests questionnaire was given to the group of 600, and the 
following items were found to be significantly associated with good 
course results: 

Staying on at school after the age of 16. 

Attendance at continuation or evening classes (not necessarily 
technical; banking, insurance and accountancy students were 
the best, presumably because they showed greatest “serious¬ 
ness of purpose”). 

Having passed any examination. 

Obtaining credit or distinction in School Certificate or Matriculation
v. passing or merely reaching “School Certificate
standard.”

School Certificate or Matriculation in mathematics and/or 
physics. 

Ticking interests in metalwork, house repairs, radio repairs, 
electrical repairs, or photography. 

Among the passes 43 per cent. and failures 7 per cent. showed
all five favourable educational signs, while 10 per cent. and 33 per
cent. respectively showed only 2, 1 or 0 signs, corresponding to
a correlation of ·42. Again, 66 per cent. and 29 per cent. respectively
checked two or more of these interests, corresponding to a
correlation of ·22. These two factors and T2 were almost independent,
and a scoring scheme based on all three gave the very
promising multiple correlation of [figure illegible].



CHAPTER IX 

THE INTERVIEW AND ITS VALIDITY 

Abstract.—Hints are given on the conduct of employment or
diagnostic interviews, with special reference to the contributions
of Oldfield, Wilson and others. A number of the points are listed
into which Service psychologists, psychiatrists and P.S.O.s
enquired when attempting to assess the attitudes and emotional
stability of recruits. Previous research shows considerable inconsistency
between the judgments of different interviewers of the
same man, and suggests that such judgments are often so biased as
to have very poor validity. While it is admitted that the personality
as a whole can only be summed up by the “clinical” approach of
an interviewer, there is evidence that the “psychometric” approach,
which measures partial facets of the person, may actually give
better predictions of occupational suitability. Investigations
in the Services tended to confirm the unreliability
of interviewing, and to show that test results alone have better
validity than the average P.S.O. interviewer. But there were large
individual differences in the skill of different P.S.O.s, and the
value of interview judgments by qualified psychologists and
psychiatrists was generally high. It is pointed out that, in spite of
its defects, interviewing is often more economical than an elaborate
psychometric programme, and more acceptable to the persons
undergoing classification and to their employers.


The technique of interviewing, and the value of conclusions 
based on interviews, have been widely studied by psychologists. 
Useful references include Anderson (1929), Bingham and Moore 
(1931), Symonds (1931), Strang (1937), Rodger, Nadel and Brown 
(1939). We shall summarise some of the main points briefly, and 
then consider in more detail recent contributions by British 
writers. 

(1) The large variety of types of interview, or of aims of interviewing,
should be recognised, since they require different
approaches and techniques; for example, the diagnostic
interview (vocational or psychiatric), the treatment interview
(psychotherapeutic or the giving of vocational or
educational advice), the research interview for collecting
information (the social worker’s, anthropologist’s, public
opinion surveyor’s, and others). Interviewing by a board is
generally considered more artificial, and more disturbing
than the ordinary two-person interview.
(2) Though mechanical rules cannot be laid down, the technique
of interviewing can be formulated and greatly
improved by training. Interviewers should also be carefully
selected.

(3) The main qualities of a good interviewer, and the main
factors leading to good rapport, are thorough knowledge of 
the jobs or other matters with which the interviewee is con¬ 
cerned and of topics in which he is interested, emotional 
maturity or a well-adjusted personality such that the inter¬ 
viewer is not shocked by anything the interviewee says, 
reputation among previous interviewees for sincerity, sym¬ 
pathy and sensitiveness. Good health and freedom from 
fatigue or strain are also valuable. 

(4) While many writers urge greater standardisation of pro¬ 
cedure, and while an oral questionnaire may be adequate for 
some purposes, e.g. public opinion research, most psycho¬ 
logists, in this country at least, insist on flexibility. They 
may have a scheme of points to cover, or hold a general plan 
in mind, but they prefer not to keep to any fixed order of 
topics since this makes for awkward switches. The
interview is not an experimental situation; it is more effectively
controlled and standardised by aiming at good rapport
than by using stereotyped questions.

(5) All available information should be ready, and should be
critically studied, beforehand; for example, test scores and 
material which can appropriately be collected by question¬ 
naire. 

(6) Nervousness can be reduced by appropriate treatment of the 
waiting interviewee, and by suitable arrangements of the 
physical environment in which the interview is held. Equally 
important is a calm, unhurried approach, giving the impres¬ 
sion of plenty of time to spare. 

(7) Opening enquiries should be factual or otherwise designed 
to put at ease. Topics likely to arouse shyness, guilt or
inferiority should generally be approached indirectly. But 
the course of the interview must be adapted to the inter¬ 
viewee’s personality and may be very different with different 
people. 

(8) In general the interviewee should do most of the talking; 
at least there should be provision for spontaneity and initia¬ 
tive on his part. Information obtained by interrogation is 
known to be less reliable than spontaneous narrative. 

(9) A useful way of getting rational and impersonal information
on traits and interests is to invite comparisons: “Do you 
prefer doing X to Y ? Are you like your brother (or sister) in 
Z?” 

(10) The taking of notes depends on the interviewee and the 
topic under discussion. Notes should be legible and concise.
It is of great importance to have records of factual evidence 
to back up any conclusions reached or recommendations 
made, and not to trust to memory. Information received and 
observations made should be carefully distinguished from 
interpretations. 

(11) There is ample evidence that any prejudice on the part of
the interviewer may unwittingly affect the information he
collects, or judgments he forms. In diagnostic interviewing
he should particularly try to avoid being biased by one or
two outstanding merits or defects in the interviewee, and by
his first impression of him, or by chance resemblance to
acquaintances, thus neglecting many of the complexities of
personality.

(12) Trust in external signs of personal traits (facial
features, type of build, gestures, speech, dress, etc.) is
highly delusive. Manner and expression may,
indeed, be significant, but must be interpreted with caution
since they may arise from different causes in different people,
and they are largely dependent both on transient moods, e.g.
temporary nervousness, and on cultural conventions. There
is probably no relation whatever between any static facial or
bodily characteristic and any vocational skill (apart from
crippling abnormalities).

(13) In the diagnostic interview it is generally useful for the interviewer
to express his main conclusions in the form of ratings
on a small number of important and more or less independent
personality traits. Such quantification helps to produce
a thoughtful analysis of the evidence and is valuable
in follow-up. The ratings may be supplemented by a pen-picture
of the person as a whole.

(14) When an interview involves guidance, the interviewer 
should sum up the decisions reached at the close, and should 
ensure that the interviewee is set to take some appropriate 
action. 


Recent Discussions of Interviewing 

Among publications since 1939, Oldfield’s book (1941) is outstanding
for its analysis of the steps by which interviewers reach
their conclusions. In his view the aim of the employment, and 
many other types of, interview is not to collect factual information 
nor to make judgments from this information or other clues, but 
to stimulate the interviewee to display his “attitudes,” and to 
observe these attitudes. By attitudes he means consistent ways of 
reacting to situations or topics, or consistent types of behaviour, 
which are more specific and more overt than the underlying
personality traits. Such attitudes are partly inferred from expressive
movements (manner, speech, etc.) and partly from information
about past history, but chiefly by raising various topics and seeing 
how he reacts, by observing the effects on him of sympathy, 
humour, agreement, disagreement, interest, surprise, etc. We thus 
get “the growth and continuously increased articulation of a general 
schematic picture of his (the interviewee’s) personality in the inter¬ 
viewer’s mind.” This schema or working model can then be 
imagined as fitting, or not fitting, into the prospective job. Generally,
however, it is implicit or unformulated and the psychologically
untrained interviewer has considerable difficulty in putting it into 
words. He tends to subsume it under one or another of a set of 
stock types or frameworks which inevitably distort the complex 
personality structure. 

Wilson (1946) has described the methods he employs in inter¬ 
viewing candidates for high-grade employment or training, such as 
officer cadets, skilled mechanics and engineers. His main theme is 
that an interview which can last only about twenty minutes should 
be a strenuous interplay between the two persons concerned, not 
just a pleasant conversation leading to a number of intuitions. No
questions should be so superficial that they can be answered merely
by yes or no. The interviewer should consciously control and vary 
the “pace” of the interview. He should possess a clear idea of his 
objectives and should try to collect evidence under a set of headings 
such as those listed below, in terms of which Oldfield’s schema may 
be formulated and the suitability of different candidates compared. 
These do not constitute topics which are taken up seriatim; rather
the course of an interview consists of an oral autobiography,
prodded by the interviewer, and any items may contribute data 
under several headings. 

(1) Effective Intelligence.—What use has the candidate made of
the ability shown by his test scores; what are the quality,
pace and efficiency of his thinking and self-expression? Can
he analyse, abstract and generalise about novel and complex
topics, or is his thinking merely reproductive, vague or
muddled? Does he show intellectual assurance, clarity and
fluency?

(2) Technical Prowess.—Just as effective intelligence differs
from test scores, so here the interviewer tries to go beyond
the candidate’s academic or industrial record. What use can
he make of his book-work? Does he understand basic ideas
and can he apply them to new practical problems
resourcefully? Is his technical capacity confident, dilettante, original,
or restricted?

(3) Interests.—Their number, range and depth, their stability
or changeability, whether integrated or conflicting, and
whether relevant to the proposed employment. The
interviewer should conjure up the atmosphere of the job to see
whether it may be distasteful. How did the main interests
arise; are they autonomous or based on some temporary
identification?

(4) Motivation.—Soundness of reasons for desiring employment
(or a commission), and strength of drives for undergoing
long and tiresome training. Was his motivation steady in the
past, showing responsibility, pride in achievement, desire to
complete any task undertaken? Does he work to inner
standards rather than outer demands? Has he a useful degree
of compulsive or obsessional tendency? Does he possess a
detailed and realistic knowledge of the proposed job or
training course, and has he taken the necessary preliminary steps?
With this are linked his energy or forcefulness and his
doggedness or persistence in the past, viewed in the light of
his vocational opportunities.

(5) Dominance or competitiveness, i.e. his managerial and
leadership potentialities.

(6) Acceptability to people he will work with, under or over,
including social and athletic accomplishments, freedom from
annoying habits, conventionality of outlook, liking for
company. This is of importance even among such persons as
skilled mechanics whose work appears to involve few social
contacts.

(7) Personal Attitudes.—Whether co-operative or individualistic,
irritable or impatient. Who is he influenced by or dependent
on, e.g. what are his relations to his parents? What are his
reactions to the Service and authority, also to exerting
authority?

Numerous hints or clues are listed by Wilson which may throw
light on these headings, or which should be distrusted. Most
reliance should be placed on the candidate’s past record (as amplified
orally) and on his test results. One feature of Wilson’s system that
appears a little dangerous is his advocacy of informal achievement,
aptitude and personality tests during the interview; for example,
setting problems to bring out mental or technical ingenuity and
pertinacity, observing whether he “crumbles” when the “pace” of
the interview is too great, or whether his “dominance” comes out
in his attitude to the interviewer. There is so much evidence of
the lack of validity, and of divergences in interpreting, such
uncontrolled tests that interviewers should not generally be encouraged
to use them. Valuable factors, which he also stresses, are the
possession by the interviewer of sufficient information about any
and every technical, cultural and athletic field to be able to probe
the thoroughness of the candidate’s relevant interests, of adequate
psychological knowledge to recognise abnormal motivations, and
of familiarity with the “mythology” which candidates such as
recruits hold regarding the advantages and disadvantages of
various employments. Finally, while Wilson prefers separate
interviews by the psychologist, the medical officer and the technical
examiner, etc., he considers that candidates should not be
accepted or rejected on the basis of any one of them, but should be
discussed at a conference where the findings can be integrated, and
they can be considered from all angles.

An article by Misselbrook (1946) gives a useful picture of the 
interview at, as it were, a lower level, that is the P.S.O.s’ ten to 
fifteen minute interview with candidates for the great bulk of
Service employments where the effective use of the available time 
is of the utmost importance. He points out the value of the pre¬ 
liminary talk to each batch of recruits, of the information charts 
(the extent to which a recruit has studied these gives a useful index 
of the seriousness of his attitude to an employment), and
particularly of the questionnaires as checked up by the recruiting
assistants. If the interviewer is practised in “sight reading” such
questionnaires, he or she can adopt a very flexible approach, since
many of the essential points are already covered. Among other aids 
to the harassed interviewer are sets of oral trade questions which, 
if known by heart, can be applied informally; photographs which
bring out the conditions of work in prospective jobs and often
stimulate comments that reveal the candidate’s attitude; and
trade test pieces which help to put the candidate at ease since he can 
talk about familiar, concrete things, and which bring out knowledge 
of methods of production, of standards of workmanship and so on. 

Follow-up research on the Admiralty questionnaire, together 
with suggestions put forward by Wilson and Misselbrook, enable 
us to list a number of easily recognisable “contra-indications.”
When several of these bad points are present in the same
candidate, they provide a poor prognosis for success at any responsible
Service employment. Validatory investigations have yielded such
uniform results in different jobs that we are entitled to regard
“irresponsibility” as a general trait. It is the exception rather than
the rule for a man who is a failure in one kind of work to find
another kind to which he settles down and manages really
successfully. The list includes:

(1) Unsteady, unprogressive or retrogressive work record;
periods of unemployment considered in relation to district,
also to ability. Thus, long-term employment at a low-grade
job would be commendable in a very dull, but not in an
intelligent, candidate.

(2) Inability to give an intelligible account of his own job. 

(3) Post-war ambition below the level expected from pre-war
occupation and test scores.

(4) Failure to reach the usual form for his age and intelligence
at school.

(5) No further education in connection with occupation, or
otherwise, since 14. 

(6) Dislike at school for mathematical and science subjects. 
Handwork, athletics, geography or “none” listed as the 
school subjects liked best. 

(7) Interests limited to the purely social and display type, e.g. 
jazz playing, the purely athletic, the entirely solitary, or too 
restless and migratory. 

(8) No positive indications of leadership and responsibility. 

(9) Evidence of marked pre-occupation with own health. 

Overlapping with this list is the series of questions, drawn up by
naval neuropsychiatrists, for application by Wrens at recruiting
centres, in order to bring to light potential cases of psychiatric 
breakdown. The questions were not asked in a stereotyped fashion 
but were introduced, often indirectly, as topics for discussion. 

(1) Lack of any sustained interests or hobbies, and avoidance of 
social or of physically dangerous interests. 

(2) “For what illnesses have you been off work during the last 
five years?” 

(3) “Has your health been good? Have you had to go to your
doctor a certain amount (or often)?” The aim is to find how
far minor ailments have been allowed to interfere with
normal activities.

(4) Digestion, dieting. “Do you have to be careful in what you
eat?”

(5) “How well do you sleep?”

(6) “What about your powers of endurance? Do you easily get
tired?”

(7) “Would you regard yourself as sensitive or highly strung?
Have you ever had any trouble with your nerves?”

(8) Incoherence of employment record.

A recruit whose illnesses interfered disproportionately with his
career, or who drifted from one job to another for no good reason,
or who showed others of these symptoms, was marked “P” by the
Wren on his questionnaire—that is, positive signs of instability.

Interviews by Army P.S.O.s had to keep more closely to the
qualification form, since this had not been previously checked by
recruiting assistants as had the naval questionnaire, also because it
was desirable to restrain their desire to employ hunches in
preference to factual evidence. They were, however, instructed to sum
up their conclusions in the form of rough ratings for the following 
qualities: 

(1) C.T. (combatant temperament). A low rating was given if
the P.S.O. felt “apprehension about going into action” with
the recruit, if his mode of life was unusually sheltered or 
restricted, or if he showed undue avoidance of vigorous and 
adventurous pursuits, and fear of serving abroad. (Such 
cases were referred to the psychiatrist.) A high rating was 
given to the outstandingly adventurous and aggressive. 

(2) E.R. (employment record). A high rating was given if 
employment (considered in the light of aptitudes and oppor¬ 
tunities) had been progressive, coherent and continuous; a 
low rating if the opposite. 

(3) L. (leadership). This was based on good, average or poor 
experience of supervising and directing others in his work or 
spare-time activities. 

(4) O.R. (officer recommendation) and/or potential N.C.O.
were also recorded. 

Little information is available as to the employment interviews 
conducted by psychiatrists either with the low-grades, Army officer 
candidates, or neurotics. However, Gillespie issued a list of factors 
to be considered by R.A.F. medical and psychiatric inter¬ 
viewers in assessing neurotic predisposition. These overlap 
to some extent with the points into which Service psychologists 
enquired: 

(1) Symptoms such as stammering, persistent enuresis, insomnia. 

(2) Neurotic fears—of the dark, loneliness, closed spaces, etc. 

(3) Unsatisfactory record at school and work, the latter often 
shown by frequent changes. 

(4) Comparative lack of interest in games involving bodily risks 
—boxing, diving, climbing. 

(5) Inadequate reasons for joining R.A.F. or air crew.

(6) Response to sight of blood, e.g. fainting. 

(7) Visceral responses before some ordeal such as an examina¬ 
tion, important match or crucial interview. 

(8) Family background, history of mental and nervous illness,
“broken home.”

(9) Reactions to fighting at school.

(10) Food, alcohol and tobacco habits, excessively temperate or
intemperate, or hypochondriacal.

(11) Temperament: sociable or unsociable; obsessional traits;
hysterical (dramatic, immature, suggestible, dodging); 
psychopathic (lack of persistence, extreme irritability, gener¬ 
ally poor personality); depression; anxious; narcissistic 
(conceited, flamboyant, self-centred). 

(12) Persistent air-sickness, headaches, etc., during training. 

Reliability and Validity of Interview Judgments

Investigations of the employment interview described by
Hollingworth (1929), Hartog and Rhodes (1925) and others suggest
that there is so much divergence between the views of different
interviewers of the same candidates, that the technique is
practically worthless except for such restricted purposes as a business
man choosing his own secretary. Several experiments, however,
have shown much the same reliability for interview judgments as
for ratings by friends and acquaintances, namely, correlations of
between .6 and .8. Thus, in Webb’s (1916) investigation, when
schoolboys were interviewed only for a minute each by two
interviewers separately, the agreement between their ratings on several
traits was .51. In Magson’s (1926) research, boards of five
interviewers, including business and professional men, assessed the
intelligence of student teachers from interview with an average
inter-correlation of .62. Here one interviewer did all the
questioning, the others listened. Additional judges who only read a
transcript of the conversation correlated .88 with the interviewers.
Fearing (1942) studied ratings of police officer candidates by
boards of four judges and found correlations of .23 to .48 on ten
separate traits. But for the total grading based on all the traits the
mean agreement was .69. Several investigations of public opinion
surveys show high, but far from perfect, agreement when a second
interviewer asks the same questions shortly after the first one (cf.
Cantril, 1944; McNemar, 1946). On straightforward, concrete
questions such as ownership of a motor-car or telephone there was
approximately 90 per cent. agreement, and on ratings of age 71 per
cent. of identical judgments, corresponding to a correlation of .91.
But with ratings of more complex qualities such as economic
status, only 64 per cent. of judgments were identical—a correlation
of .63. Perhaps the most striking study was that of Newman,
Bobbitt and Cameron (1946), who interviewed 636 U.S.
coastguard officer candidates and tried to assess the very complex
quality of ability to pass training and to make a good officer in
action. One psychiatrist’s judgments correlated .81 and .86 with
independent interview judgments by two psychologists. This
suggests that psychologically-trained interviewers who have
reached a clear and agreed conception of what they are looking for
can achieve very satisfactory reliability.

Investigations of validity are not very favourable. Webb obtained
a correlation of .63 between his interviewers’ judgments and
judgments by teachers who knew the boys, but Magson only found a
correlation of .18 between interviewers and fellow students, and
.12 between interviewers and results of intelligence tests. Possibly
the conception of intelligence among adults is particularly equivocal.
Clark (1926) showed that two interviewers could estimate the
college grades of students with validities of .66 and .73. But even
these predictions are little if any better than might be obtained
from a battery of aptitude and achievement tests, and from previous
grades. The data reported by Bobbitt and Newman (1944) on
1,900 cases yield tetrachoric correlations of .49 between combined
interview judgments and passing the officer cadet training course.
Test scores alone (which were known to the interviewers) had a
predictive value of .47. The combination of interview judgments +
tests gave the best validity, namely, .66. Anderson (1929) presents
a number of tables showing the relevance of qualities assessed in
a psychiatric interview to success at salesmanship and other
commercial appointments, and claims that routine selection procedures
which omitted this interview were less diagnostic. Sarbin,
on the other hand, quotes studies by Wittman and himself in which
predictions derived from batteries of tests or other factual items
were superior to clinical judgments based on interview. Thus, in
one research on the prognosis of schizophrenic patients, an
appropriate combination of items from a rating scale on present
symptoms gave better correlations with subsequent outcome than did
predictions by psychiatrists. Similarly, in a study of college
students, two tests (whose multiple regression equation had been
established in a previous group) correlated more highly than
interview judgments with academic achievement. This would indicate
that suggestion, halo and bias, fallacious inferences from external
signs and the like play so large a part even in professional
interviewing that the inclusion of an interview may distort rather than
improve vocational procedures. Conrad (1947) reaches the same
conclusion.

The controversy between the “clinical” and “psychometric” 
approaches to the study of personality is a perennial one. Allport 
(1942) contrasts the “nomothetic” and “idiographic” viewpoints
in psychology. He claims that the objective and analytic methods 
of the former disrupt the integrated structure of the personality, 
and that the more intuitive idiographic study of the individual 
person as an organised whole makes for better “understanding, 
prediction and control.” Hence the latter approach is no less 
scientific than nomothetism. Vernon (1936) describes experiments 
by von Bracken, Cantril and himself showing “that personality can 
be more accurately and consistently judged as a structured whole 
than can the separate traits or components into which it may be
analysed.” Similarly Polansky’s (1941) investigation proves that 
better understanding of people and prediction of their actions may 
be derived from “structural” case studies than from lists of test
scores. These results do not, however, affect the present issue— 
namely, that clinical judgments of the personality as a whole, based 
on interview, may be so unreliable or so distorted that better pre¬ 
dictions of suitability for a particular vocation may be obtained 
from properly weighted combinations of psychometric data. Yet 
another aspect of the same controversy occurs in public opinion 
research. It seems very plausible that the free interview and obser¬ 
vation techniques used by Mass Observation provide a better under¬ 
standing of people’s attitudes than do answers to a stereotyped list 
of questions, but it has yet to be proved that subjective factors do 
not play an unduly large part in Mass Observation’s collection and
interpretation of its evidence (cf. Durant and Harrisson, 1942). 

As long ago as 1926 Viteles argued that an adequate picture of 
occupational suitability cannot be obtained from test scores or 
objective items of past history, and that the clinical approach of the 
trained psychologist is needed to take account of the other
non-measurable factors and to integrate them into an over-all view of
the candidate. Freyd (1926), however, maintained that the relevance 
of any such additional factors should be systematically investigated 
and validated in the same way as that of test scores. Wallin (1941) 
and Sarbin (1944) again state that the interviewer is really doing
the same thing as the psychometrist—that is, combining the
indications of various items which he believes, on the basis of past
experience of similar cases, to be predictive of success or failure, 
into a total judgment of suitability; but that he is doing it inform¬ 
ally and subjectively, and, therefore, less effectively. 

Clearly it is incumbent on psychologists who regard the inter¬ 
view as a valid instrument of selection and guidance to produce 
experimental evidence. Members of the N.I.I.P., whose views 
accord with those of Viteles, have indeed proved the validity of 
their vocational guidance which largely involves interviewing, but 
have not demonstrated that they could not predict as effectively
without it. Yet another important query that we must try to answer
is whether an interview by a trained psychologist or psychiatrist 
is superior to one by an ordinary industrial employer or an experi¬ 
enced officer in the Forces. 

Reliability of Service Interviews 

No data are available comparing two independent interviews by 
psychologists or P.S.O.s except those listed in the previous 
Chapter (Table VI), but there is evidence of considerable varia¬ 
bility in standards of judgment. Thus, the overall proportion of 
naval recruits marked “P” (unstable) by Wren recruiting assistants, 
who interviewed some 80,000 men in 1943, was 2.69 per cent. But
the range of 101 different Wrens (each of whom interviewed 300 to
1,200 men) was from 0 to 12.9 per cent., and the quartile range
0.6 per cent. to 3.6 per cent. Each Wren had a fairly steady rate of
her own, the correlation between the “Ps” in the first and second
halves of the period being .79. It was also found that Wrens judged
by their supervisors to be high on such traits as carefulness,
steadiness, conscientiousness, reliability, experience and teachability marked
the largest numbers. The rate varied considerably with area—that
is, all the Wrens working in a given area had a common
policy, but there were significant individual differences over
and above this.

Such results should not be regarded as a serious criticism of the 
Wrens, since, owing to the impracticability of arranging psychia¬ 
tric interviews of “P” men, little stress was laid on this rating. In 
H.M.S. Royal Arthur, P.S.O.s were encouraged to refer all
doubtful cases to the psychiatrist, and among nine of them, each of whom
interviewed 600 to 1,200 recruits during 1944, the referral rates
ranged from 2.8 per cent. to 9.7 per cent., the total rate for over
8,000 recruits being 6.3 per cent. The psychiatrist classified these
referrals as follows:

6.7 per cent. were considered as suitable only for immediate
discharge on psychiatric grounds.

17.8 per cent. were judged likely to give six months’ service or
less before breaking down or showing criminal tendencies.

49.6 per cent. were borderline, being likely to give at least a
year’s service; they were, however, definitely thought worth
referring.

27.0 per cent. were found to possess no psychiatric abnormality.

These figures suggest fairly close agreement between the P.S.O.s’
and psychiatrist’s judgments of instability, though in the absence
of a control group of men not referred, no exact correlation between
them can be calculated. The numbers in the first two groups
referred by different P.S.O.s also varied from 11 per cent. to 37 per
cent., and it was noted that, on the whole, P.S.O.s who sent the
biggest total numbers also picked the biggest proportions of these
serious cases. In the Army, P.S.O.s referred 14 per cent. of recruits
to psychiatrists in a representative period, but their variability was
not checked. Large variations were found, however, in the
proportions of officer recommendations made by different
P.S.O.s.

Evidence of the reliability of psychiatric interviews in the R.A.F.
is given by Hill and Williams (1947) in a study of 541 cases who
were seen by more than one of thirty-seven psychiatrists at
different centres, at intervals ranging from several days to a few
months. Among the items noted by the psychiatrists were diagnosis
or “reaction type” under one of six headings, and degree of
predisposition to neurosis—severe, mild or none. Of the diagnoses
81 per cent. were in exact agreement, but the figure varied
considerably in different reaction types. If the average agreement in
four main types is taken—anxiety, depression, hysteria, and
others—it drops to 68½ per cent., corresponding to a reliability coefficient
of .76*. The reliability of judgments of neurotic predisposition was
somewhat inadequate, corresponding to a tetrachoric correlation of
.58. Nearly half the men assessed as “severe” by one psychiatrist
were called “mild,” or occasionally “nil” by the other. Further
judgments, such as degree of flying or other stress, showed
intermediate degrees of agreement.

* This was calculated by Butt's (1946) matching formula. 



Validity of Recruiting Assistants’ and P.S.O.s’ Interviews

The ability of Wren recruiting assistants, using the questions
listed above, to pick psychiatric cases was studied in an experiment
where four Wrens interviewed 147 actual psychiatric in- or
out-patients mixed with 176 controls (Curran and Roberts, 1946). On
the basis of their enquiries into pre-Service history only they
picked 68 per cent. of the cases and 16 per cent. of the controls as
suspect. But it was later discovered that of the 16 per cent. “false
positives,” most of whom were seen by a psychiatrist, over half did
actually possess some considerable degree of abnormality. The
Wrens missed a rather high proportion of the affective psychotics
and neurotics, but managed to pick up twenty of the twenty-four
hysterical patients. Obviously this experiment could not prove that
the Wrens would have been equally successful in diagnosing
breakdown before it occurred. Another weakness was that the
four Wrens, though not necessarily the best available, were known 
to be generally good at their work, and others might not have done 
so well. It should also be noted that their success was not any 
greater than that of the various personality inventories which 
have been tried out as screening tests (cf. Chapter XV). 

Variations in success of interviewing were shown when 1,631
naval radio and electrical mechanics and wiremen, who had been
selected by thirteen different P.S.O.s, were followed up. The
failure rates during their training averaged 24 per cent., but ranged
from 18 per cent. among men selected by the most successful
P.S.O. to 37 per cent. for men selected by the least successful. This
study, however, is incomplete, since we do not know how many
potentially suitable men were missed. Conceivably the P.S.O. of
whose selectees only 18 per cent. failed was more “choosy” than
the rest, and had she accepted larger proportions of her
interviewees her failure rate might have risen. In an analogous
investigation in the Army, nearly 3,000 W.O.S.B. candidates were
“followed back” to the P.S.O.s (or other officers) who had
originally recommended them. Although the numbers interviewed by
any one P.S.O. were rather small, it was possible to show that there
were significant variations among P.S.O.s both in the proportions
of suitable recruits whom they recommended, and in the failure
rates of their recommends at the W.O.S.B.s; also that the more
“choosy” P.S.O.s did not necessarily pick better candidates.

In seven investigations it was possible to correlate both test
results and P.S.O. judgments of suitability, based on interview,
questionnaire and tests, with subsequent training course results or
proficiency grades, and in only one of these did the judgments
show better validity than some of the tests. Thus, among 730 Army
personnel trained as drivers the correlation between P.S.O.s’
assessments and pass or fail was .234, while the validities of the
Bennett test (S.P.2) and age (youthfulness) alone were .294 and .288.
Gradings by a P.S.O. with engineering experience were available
for 1,614 boys starting training as Army tradesmen; compared with
practical marks over the next 1-3 years their validity was .272,
those of the Bennett and Squares (S.P.4) tests being .294 and .288.
However, the gradings were not intended to predict general
suitability for trade training so much as to advise the training schools
on the best type of course for each boy, hence this comparison may
not be entirely legitimate.

The final course results of 411 naval radar plot operators were 
collected (cf. p. 244), and the correlation of these with P.S.O.s’ 
assessments was .374. The correlations for T2, a scale reading test
and a graph reading test were .416, .430 and .400 respectively. If
T2 is held constant, the partial correlations with the criterion, i.e. 
the extent to which the P.S.O.s or the other tests can improve on 
T2 are .157 for the P.S.O., .172 and .193 for scale and graph
reading. It should be noted that in this instance the P.S.O.s had 
rejected a (not very large) proportion of recruits as unsuitable for 
training; hence those that did go forward were more highly 
selected on P.S.O. judgments than on T2, also more selected on 
T2 than on the other two tests. Probably if allowance could be 
made for this selectivity, T2, P.S.O. and scale reading would all 
show almost the same validities. 
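
The idea of "holding T2 constant" can be sketched with the standard first-order partial correlation formula. This is an illustration only: the two correlations with the course result are the figures quoted above, but the correlation between the P.S.O.s' assessments and T2 is not reported in the text, so the value used for it below is purely hypothetical and the result should not be read as reproducing the published partial correlations.

```python
from math import sqrt


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Figures quoted in the text.
r_pso_course = 0.374  # P.S.O. assessment vs. final course result
r_t2_course = 0.416   # T2 vs. final course result

# HYPOTHETICAL value: the text does not report this intercorrelation.
r_pso_t2 = 0.5

# x = P.S.O. assessment, y = course result, z = T2
pso_beyond_t2 = partial_corr(r_pso_course, r_pso_t2, r_t2_course)
print(round(pso_beyond_t2, 3))
```

The larger the assumed overlap between the P.S.O.s' assessments and T2, the smaller the partial correlation becomes, which is why a judgment that leans heavily on the test score can add little to it.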

Table XXXIX (p. 246) lists the correlations of P.S.O. judgments 
of 601 radio mechanics and 342 electrical mechanics with success 
or failure on course, together with the validities of several S.P. 
tests among some 300 and 200 of these men. Again the P.S.O.s tend 
to be surpassed by T2 and by information tests which are directly 
relevant to the jobs. But here, too, selectivity imposes a greater 
handicap on the P.S.O.s than on the tests. Similar investigations 
in the U.S. Navy yielding equally unfavourable results are 
described by Stuit (1947). 
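
The handicap that selectivity imposes can be illustrated with the standard correction for direct restriction of range (often called Thorndike's Case 2). This is a sketch under stated assumptions, not the authors' procedure: the text does not say which correction, if any, was contemplated, and the ratio of standard deviations used below is an invented value. The observed validity is the figure quoted earlier for P.S.O. assessments of radar plot operators.

```python
from math import sqrt


def correct_for_range_restriction(r_restricted: float, sd_ratio: float) -> float:
    """Estimate validity in the unselected group from validity in a selected group.

    sd_ratio is the predictor's standard deviation in the unselected group
    divided by its standard deviation in the selected group (>= 1 when
    selection has narrowed the range).
    """
    r = r_restricted
    return (r * sd_ratio) / sqrt(1 - r ** 2 + (r ** 2) * (sd_ratio ** 2))


observed_validity = 0.374  # quoted in the text for the selected group
assumed_sd_ratio = 1.2     # HYPOTHETICAL: not given in the text

estimated_validity = correct_for_range_restriction(observed_validity, assumed_sd_ratio)
print(round(estimated_validity, 3))
```

With a ratio of 1 (no selection) the function returns the observed value unchanged; any ratio above 1 raises the estimate, and the more heavily a predictor was used for selection, the larger the ratio and the larger the adjustment.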

The most successful judgments were made by one highly
experienced P.S.O. She graded 460 men who had been sent on course as
safety equipment ratings, without seeing either the men or their
test scores, but only from their questionnaires. It should also be
noted that the relevance of particular questionnaire items had not
been studied in any previous follow-up investigation.
Nevertheless, the correlation between her judgments and subsequent course
marks was .664, whereas T2 alone gave a correlation of .393 and
no other combination of the standard tests would have improved
appreciably on this. Her correlation with T2 was .312, hence it
was clear that she was not merely assessing general ability from
previous education and occupation, but was effectively judging the
character traits and interests relevant to the job.

Validity of Psychologists’ Interview Judgments

The superiority of psychologists’ interview judgments both to 
judgments by a board of naval officers and to predictions based on 
intelligence tests was indicated in a study of 503 R.N.V.R. cadets. 
The great majority of these were seen by Wilson, who used the 
methods outlined above, and the remainder by other Admiralty 
psychologists. T2 scores were available, but not the reports
specially prepared for the Admiralty Board on the candidates’
previous record in the Navy. The boards, on the other hand, had,
as well as these reports, the psychologist’s reports and T2s. Only 228
candidates were allowed to proceed for training by the boards and of
these 61.4 per cent. passed. Of the 176 favourably graded by the
psychologist 71 per cent., and of the fifty-two unfavourably graded
29 per cent., passed, corresponding to a tetrachoric correlation of
.67. As none of the candidates unfavourably regarded by the board
were sent for training, its predictions cannot be expressed as a
correlation, and the two percentages 61.4 and 71 are not
comparable. But if all candidates had gone forward, the superiority of the
psychologist would almost certainly have been more marked*.
Predictions within this group of 228 on the basis of T2 alone gave
a validity coefficient of only .24.
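
The arithmetic behind these figures can be checked with a short sketch. It rebuilds the fourfold table from the percentages quoted above, confirms the pooled pass rate, and then applies the cosine-pi approximation to the tetrachoric correlation. That approximation is an assumption made here for illustration: the text does not say how the published coefficient was computed, and the approximation tends to differ somewhat from a full tetrachoric calculation when the marginal splits are uneven, as they are here.

```python
from math import cos, pi, sqrt


def tetrachoric_cos_pi(a: int, b: int, c: int, d: int) -> float:
    """Cosine-pi approximation for a 2x2 table [[a, b], [c, d]].

    a and d are the concordant cells, b and c the discordant cells.
    """
    if b == 0 or c == 0:
        return 1.0  # no discordant cases in one direction
    return cos(pi / (1 + sqrt((a * d) / (b * c))))


favourable, unfavourable = 176, 52            # psychologist's gradings (from the text)
fav_pass = round(0.71 * favourable)           # 71 per cent. of 176
unfav_pass = round(0.29 * unfavourable)       # 29 per cent. of 52
fav_fail = favourable - fav_pass
unfav_fail = unfavourable - unfav_pass

pooled_pass_rate = (fav_pass + unfav_pass) / (favourable + unfavourable)
r_approx = tetrachoric_cos_pi(fav_pass, fav_fail, unfav_pass, unfav_fail)

print(round(100 * pooled_pass_rate, 1))
print(round(r_approx, 2))
```

The pooled pass rate recovered from the two subgroups agrees with the overall figure given for the 228 candidates, which is a useful consistency check on the table before any coefficient is computed from it.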

In a small W.O.S.B. investigation of officers who suffered 
psychiatric breakdown, psychiatrists, psychologists, tests, and
non-technical board members were compared. Eighty-nine cases were

* Since the agreement between the hoard's and psychologist’s jud^mta 
corresponds to a correlation of *71, it follows that the psychologist's predictions 
were validated only in a highly selected group. Thus in a less selected group 
of candidates his validity coefficient would certainly rise. The low validity of 
T2 is also largely due to selectivity. Among an earlier, unselected, group of 
41d candidates its validity was ‘70, 
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traced who broke down within 1J years after passing their boards 
and the judgments made at the time of boarding were looked up. 
Neither their final board gradings nor the presidents’ or M.T.O.s’
opinions of them were any poorer than of normal acceptances. The
psychiatrists had recommended the acceptance of 71 per cent. of
them, but had given a significantly larger proportion of adverse 
reports than to normal candidates. The psychologists (sergeants) 
had not seen the cases, but had studied their “pointers” material 
and had actually recommended the rejection of a larger proportion 
than the psychiatrists, namely, 48 per cent. Since, however, the 
psychologists tended to give more adverse reports on normal candidates
also, the validity of their predictions was probably much
the same. Intelligence test results were also sub-normal, 14 per
cent. instead of the normal 6 per cent. having an O.I.R. of 4 or
below. No one would expect such tests to predict breakdown, but 
in this investigation numbers are insufficient and the data too 
incomplete for it to be proved that psychiatrists are better than 
psychologists, or vice versa, or that either are better than tests. All 
three, however, are significantly better than “lay” interviewers or 
observers. 

Validity of Psychiatric Judgments 
In 1941-2 several experiments were carried out by T. F.
Rodger, Wittkower and other psychiatrists into the assessment of
officers who were attending the Company Commanders’ School
near Edinburgh. The commanding officer and his staff were able
to provide exceptionally thorough descriptions of the personalities 
and future promise of these officers at the end of their training. A 
psychiatrist interviewed each man for an average time of just under 
an hour, and wrote an independent diagnosis. On comparing the 
psychiatric and the school’s opinions close agreement was claimed 
in eighty-five out of 100 cases. In a similar study of 223 officers certain
intelligence tests were also given: Group Test 33, Matrices and/or
Army S.P. Test Id. On matching the two sets of reports close agreement
was claimed in 62 per cent., substantial agreement in 36 per
cent., some discrepancy in 6 per cent. and divergence in 6 per cent.
This matching, however, was subjective and it is impossible to say
what degree of objective correlation it represents. It was shown
that there were no significant differences in the agreement rate for
the three psychiatrists who did most of the interviewing.

Elsewhere Bowlby carried out a similar, but better controlled,
investigation of thirty-six O.C.T.U. cadets, to whom he gave the 
F.H.R. test and Matrices and a twenty-minute interview. He 
predicted their actual O.C.T.U. passing-out grades correctly in 
twenty-eight cases, and was seriously incorrect in only two cases. 
At that time, however, the cadets were very heterogeneous, having 
been selected by old-type interview boards, and it was found that 
the intelligence tests alone would have caught ten of the thirteen
men who failed training. Thus, both the psychiatric interview
and the tests achieved validity coefficients of around .50.

In 1943-4 psychiatrists took part in the selection of Army 
parachutists and, in order to gain greater familiarity with the job, 
themselves underwent parachute training. Brief interviews and a 
shortened version of the W.O.S.B. “pointers” procedure were 
used. The trainees were graded 1 to 5 for suitability. On following
up 1,492 cases, of whom 20 per cent. failed their first stage of
training, it was found that 3 per cent., 7 per cent., 10 per cent.,
23 per cent. and 46 per cent. of men in the psychiatrists’ five
grades failed. This corresponds to a validity coefficient of .68. But
on following up a year later the agreement between the grades and
either wastage (on medical, disciplinary or training grounds), or
promotion to N.C.O., was less close, corresponding to correlations
of between .2 and .3 (cf. Expert Committee, 1947).

A small experiment in the R.A.F. is reported by Rollin (1944).
Twenty-five W.A.A.F.s who failed trade training for reasons other
than dullness, and 100 normal controls were interviewed, and use
was made of Gillespie’s scheme (p. 161). The most diagnostic
symptoms appeared to be:



                                                       Failures   Controls
                                                       per cent.  per cent.
History of instability in family                           44         6
Nervous symptoms before enlistment (morbid fears,
  depression bouts, etc.)                                  66         6
Previous nervous breakdown                                 28         3
Invalidism suggestive of neurosis                          24         4
Personality difficulties (over-aggressive, passive
  or inadequate)                                           28         4
Undue timidity, shyness, over-dependence or worrying       24         9
Unsettled home (e.g. due to unemployment)                  16         3
Occupational instability                                    8         1


This analysis is, of course, ex post facto, and its validity could only
be proven if it was tried out on fresh cases before their training
started. The relation of neurotic disposition to operational efficiency
was shown very clearly by Reid. Either the medical officer or
Reid himself interviewed 200 airmen on arrival at an R.A.F. station,
and they were followed up through a tour of operations. Two or
more neurotic symptoms were found to occur in 20.4 per cent. of
those who had accidents or became casualties, and in 69 per cent.
of those who developed psychiatric breakdown, but only in 6.4 per
cent, of those who successfully completed their tour. This corre¬ 
sponds to a tetrachoric validity coefficient of + ‘BS. Finally, a large- 
scale trial of the predictive value of psychiatric interviews was 
carried out in the R.A.F. towards the end of the war. Of the 500 
men seen by two psychiatrists, only a few could be followed to the 
operational stage. The main finding was the extent to which 
the validity of the judgments depended on the personal flair of the 
psychiatrist (Williams, 1947). 

Conclusions 

While none of the experiments summarised above provides a
definite answer to our problem (whether the “clinical” approach
of the interview yields better predictions than does the psychometric
approach based on an appropriately weighted battery of
tests and factual biographical items), yet they do show that interviews
by trained psychologists and psychiatrists possess considerable
validity. Moreover, when adequate consideration has been
given to the qualities to be assessed and the methods used, and
when a certain amount of objective information is available as a
starting point, the consistency or reliability of judgments by such
trained personnel is reasonably satisfactory. At the same time the
correctness of the predictions never approaches anywhere near 100
per cent., except perhaps when the candidates are abnormally
heterogeneous. It is noteworthy also that even professional interviewers
probably do not allow sufficient weight to objective factors.
Although they consider that they have taken test scores and the like
into account, better predictions are often obtained from a combination
of their final judgments with test scores than from either alone.
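
The point about combining judgments with test scores can be illustrated with the standard formula for the multiple correlation of a criterion with two predictors. The validities and the intercorrelation used below are hypothetical values chosen for illustration only; they are not taken from any of the studies summarised above.

```python
import math


def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2 : validity of each predictor taken alone
    r12    : correlation between the two predictors
    """
    r_squared = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)
    return math.sqrt(r_squared)


# Hypothetical figures: interview validity, test validity, and their overlap.
interview_validity = 0.40
test_validity = 0.45
overlap = 0.50

combined = multiple_r(interview_validity, test_validity, overlap)
print(f"interview alone: {interview_validity:.2f}")
print(f"tests alone:     {test_validity:.2f}")
print(f"combined:        {combined:.2f}")
```

With these assumed values the combination is somewhat more valid than either predictor alone; the gain shrinks as the overlap between the two predictors grows.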

Psychological and psychiatric interviewers are almost certainly 
superior to experienced but psychologically untrained persons such 
as employers and officers in the Forces. It is doubtful whether 
interviews by P.S.O.s (even when carefully selected and trained) 
constitute a valid vocational technique. Probably some P.S.O.s are
quite as good as psychologists and psychiatrists, but there are such
large variations that it would seem wise to keep their diagnostic 
functions to a minimum and to rely to a greater extent on psycho¬ 
metric predictions whenever possible. None of the evidence that we 
have collected suggests that their subjective summing up of past history 
and their personal judgments are capable of covering all the relevant 
factors in vocational classification omitted by tests. 

As shown in the previous Chapter, the interview is essential for
checking the correctness of questionnaire entries, for giving information,
for matching supply and demand, for improving morale*,
and often because it is a vested interest. Doubtless it has valuable
applications also in treatment, counselling, research, etc., and in
providing hypotheses as to the occupational relevance of various
symptoms (hypotheses which should be subjected to empirical
verification). But we are forced to conclude that its value as a diagnostic
technique is, as Hunt, Wittson and Harris (1940) put it,
a matter of economics. Although thorough psychometric procedures
might usually be more accurate they would involve such
extensive and complicated research, and be so costly to apply, that
they may be impracticable, in addition to being unacceptable both
to candidates and to employers. The interview has the tremendous
advantages of inclusiveness, speed and flexibility. It takes into
consideration a much wider range of factors than can readily be tested
(though admittedly many of these are irrelevant and most of them
liable to misinterpretation). It requires no apparatus, not even
printed blanks. In the U.S. Navy, according to Hunt, interviews
averaging only three minutes in length, served for screening 
psychiatrically suspect recruits. It was found that a single psychia¬ 
trist could deal with 100 cases a day, though many could not 
keep up this pace for long. Even with ten- to twenty-minute 
sessions, an interviewer is likely to cover fifty men in no more time 
than a tester would take to give, score and weight the appropriate 
score combinations on, say, a two-hour battery, which would have 
required months of technicians’ time to prepare. True, these inter¬ 
views might make more mistakes than the objective procedures, 
but the sufferers (whether employers or employees) tend to resent
errors of human judgment much less than they do errors made by
impersonal instruments.

* For example, Closson and Hildreth (1944) have shown that recruits who
receive a five-minute psychiatric interview, designed to discover their “weak
spots” and to give advice accordingly, make better adjustments in the Forces
than comparable men who have merely been selected by impersonal test
methods.

The interview again can readily adapt itself to the exceptional 
case, e.g. the illiterate whose test results or questionnaire responses
are quite misleading. It can be applied in employments which 
involve small numbers, where adequate validatory evidence for 
psychometric procedures is unobtainable, and in such fields as 
managerial, officer or air crew selection, where it is particularly 
difficult to devise tests of character or temperamental qualities. 
Finally, when the requirements of any employment alter, the interview
can be adjusted at once, whereas the psychometric approach
involves a complete re-investigation of the validity of all the possibly
relevant test scores and biographical items, a fresh statistical
analysis and preparation of a fresh weighting system. As we have
seen, such re-investigation cannot properly be carried out on a 
selected group; it is hardly feasible to work out minor adjustments 
in a battery while that battery is in actual use for selection. For all
these reasons, the interview rather than tests must remain the 
prime instrument of vocational classification, and vocational 
psychologists are justified in spending as much of their time on 
attempts to improve the technique of interviewing as on devising 
more valid tests. 



CHAPTER X 


PRINCIPLES OF PSYCHOLOGICAL TESTING 

Abstract.—A psychological test, by presenting a standardised task
or situation, elicits a sample of the testee's behaviour which can be
objectively scored and compared with norms of performance, and
which has been proved to be predictive of future occupational or
other behaviour. To regard it as measuring particular traits or
abilities (the “naming fallacy”) is often misleading. Its content is
best determined by a combination of systematic introspection,
empirical validation, and factor analysis. The types of tests found
most useful in vocational classification are group paper-and-pencil
tests for measuring the factors underlying a large number of jobs,
information or trade knowledge tests, and supplementary “miniature
situation” tests, each closely analogous to some important job.

Given a standard battery of group tests, the planning and construction
of new tests is guided not so much by job analysis of the
aptitudes involved as by gauging how far this battery needs to be
supplemented, and by considerations of practicability, i.e. the
availability of materials and time for devising, giving and scoring,
etc. The main steps in such construction are outlined. Adult
tests are generally calibrated (provided with norms) in terms of
percentile levels, rather than in mental ages, I.Q.s, or “critical
scores.” There are great difficulties in indicating the relevance of
scores, especially on a battery of tests, to particular jobs. So-called
“wastage coefficients” and minimum standards are unsatisfactory.
Tables or histograms which show the optimum score levels on the 
most valid tests, and the probability of job success at each level, 
are advocated. 


The fundamental method of science is observation of the way
things (living or non-living) react in specified situations, and this
is the method of psychological testing. The tester does not deduce
people’s traits or abilities from bumps on their skulls or from their
physiognomy, nor does he rely on hunches or subjective impressions,
but records their reactions to a standard task. At the same
time a test is in no sense a miraculous instrument for revealing
otherwise unsuspected qualities; it is simply a refinement of the
process of observing behaviour which we use every day in making
judgments about our acquaintances. As pointed out by Vernon
(1940) this refinement involves: 

(1) Standardising the task or situation, so that all testees react 
to the same task. 

(2) Objective and unbiased recording of the testees’ perform¬ 
ance, usually in quantitative terms. 

(3) Availability of norms or standards by comparison with
which the goodness or poorness of performance can be
assessed.

(4) Demonstration that the test is both reliable, i.e. a precise
measure, and valid, i.e. that it actually measures or predicts
what it claims to do.
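
Point (3) can be made concrete with a small sketch of how a raw score is referred to norms. The norm sample below is invented for illustration, and the mid-rank definition of percentile rank is one common convention among several, not one prescribed by the text.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below `score`,
    counting half of those who tie with it (mid-rank convention)."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


# Hypothetical raw scores from a standardisation group of twenty testees.
norms = [12, 15, 17, 18, 20, 21, 21, 23, 24, 25,
         26, 27, 27, 29, 30, 32, 33, 35, 38, 41]

for raw in (18, 27, 38):
    print(f"raw score {raw}: percentile rank {percentile_rank(raw, norms):.1f}")
```

A raw score thus acquires meaning only by its position in the standardisation group; the same raw score would fall at a different percentile in a differently composed group.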

Certain other points should be noted: 

(5) This view of tests includes intelligence or educational tests,
where the recorded behaviour consists of answers to intellectual
problems, as well as practical vocational tests, also
medical tests of eyesight, hearing, etc., which may have a
bearing on vocational suitability. It does not cover so-called
projection tests or other clinical techniques whose object is
to provide the psychologist with insight into the testee’s
personality (cf. Thematic Apperception, p. 67), or to allow
him to judge the testee’s methods of work (cf. Cube Construction,
p. 32). Such instruments are valuable in skilled
hands, but are not strictly tests, unless objective scoring
methods are applied.

(6) A test constitutes only a relatively brief sample, or set of
samples, of behaviour. To make sure how well a man will
do in a certain occupation we should need to let him try it in
real life, but because this would take so long we employ the
short cut method of tests. It naturally follows that no test,
nor even an appropriately chosen battery of tests, will give
perfect predictions of proficiency at the occupation. Both
the sample behaviour elicited by the test and vocational proficiency
itself depend on a multiplicity of factors, so that
there can be only a certain statistical probability that testees
obtaining such and such scores will succeed or fail. If the
test has good validity the odds may be very high, but there
can never be complete certainty about any particular
individual.

(7) So far as possible a test should be named in terms of the
operations it actually involves, rather than in terms of
traits or abilities which it is supposed to measure. It is
scarcely possible to discuss or analyse vocational suitability
without bringing in personal qualities like intelligence,
attention, dexterity, persistence, etc.; but we should realise
that these are hypothetical constructs. Such terms, which
are intermediate between the observed test behaviour and
the predicted future behaviour, are often unnecessary and
misleading. The “naming fallacy” has been the bane of
vocational psychology, witness the ubiquitous use of
“manual dexterity,” “visual discrimination,” “attention to
details,” and the like. Factor analysis investigations indeed
show that there is so little overlapping between different
tests of dexterity that it is doubtful whether it should be
credited with any general and independent existence. Such
analysis does, however, suggest that certain abilities are
sufficiently consistent and distinctive to constitute useful
concepts for the psychologist’s vocabulary, and these are
usually denoted by letters such as g, k, or v, rather than by
names of dubious significance such as intelligence, spatial or
verbal ability.
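
Point (6) treats a test score as shifting the odds of success rather than settling the outcome for any individual. A minimal sketch of the corresponding expectancy table follows; the score bands and the follow-up records are invented for illustration and do not come from any Service data.

```python
from collections import defaultdict


def expectancy_table(records, band_width):
    """Proportion of successes within each score band.

    records    : iterable of (score, succeeded) pairs from a follow-up
    band_width : width of each score band
    """
    counts = defaultdict(lambda: [0, 0])   # band start -> [successes, total]
    for score, succeeded in records:
        band = (score // band_width) * band_width
        counts[band][0] += int(succeeded)
        counts[band][1] += 1
    return {band: wins / total
            for band, (wins, total) in sorted(counts.items())}


# Hypothetical follow-up data: (test score, passed training?)
follow_up = [(8, False), (12, False), (15, False), (17, True),
             (22, False), (24, True), (27, True), (29, True),
             (31, True), (33, False), (36, True), (38, True)]

table = expectancy_table(follow_up, 10)
for band, p in table.items():
    print(f"scores {band}-{band + 9}: {p:.0%} succeeded")
```

Even in the highest band of this invented sample some testees fail, which is the sense in which there can be only a statistical probability and never certainty.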

There are, in fact, three methods of deciding what a psycho¬ 
logical test measures, all of which are usually employed: 

(i) The method of a priori or theoretical analysis. Inspection of
a test merely gives us what is called its “face value,” which
has little or no scientific worth. The collation of systematic
introspections about the test from several trained observers
may be more useful, but even the views of psychologists as
to the physical or mental processes involved in taking the
test may be invalid when the prospective testees are children,
or adults of sub-average intelligence. Chapters XII-XV
give examples both of tests which failed to measure the
abilities anticipated, and others which measured unsuspected
abilities.

(ii) Follow-up evidence, i.e. the comparison of test results 
with success or failure in various jobs. Though this is 
the most scientific method, it too has its weaknesses: the
difficulty of securing reliable criteria of proficiency, the
problems introduced by dealing with selected groups, and so 
forth. 

(iii) Factor analysis, which indicates the main types of ability 
involved in a test by analysing its correlations with other, 
partially overlapping, tests. When representative groups of
adults are tested, ranging from the very highly intelligent
and well-educated to mentally-defective level, it is found
that a general or g factor enters to a large extent into practically
all tests. Next the majority of tests fall into one or
other of two contrasted types: the verbal, numerical and
educational type (v : ed factor) and the practical, mechanical
and spatial type (k : m factor). In more detailed investigations,
these types break down into numerous sub-types or
group factors. The former includes distinguishable, though
usually overlapping, verbal, numerical, clerical and
secondary school (as contrasted with primary school)
abilities. The latter group includes mechanical, spatial,
informational, physical, manual, and other abilities. The
extent to which a test depends on any of these abilities, i.e.
its factor loadings or saturations, may be expressed by
correlation coefficients*. Such figures should, however, be
accepted only with the greatest caution, since they are
largely dependent on the heterogeneity or range of ability
in the testees studied, and on the particular set of tests
chosen for analysis. Further investigation might frequently 
reveal the presence of additional group factors, and, as 
Spearman showed, every test measures something specific 
to itself. 
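
The footnote to this passage recommends squaring the loadings to see how a test's variance is made up. The arithmetic for the Matrices figures it quotes can be sketched as follows, on the assumption (which simple subtraction of the percentages implies) that the two factors are treated as uncorrelated. The loadings .79 and .15 used here are those implied by the quoted percentages 62.41 and 2.25.

```python
# Loadings of the Matrices test on the two main factors (assumed orthogonal).
loadings = {"g": 0.79, "k:m": 0.15}

# Squared loading = proportion of test variance attributable to that factor.
variance_pct = {name: 100 * value**2 for name, value in loadings.items()}
unaccounted_pct = 100 - sum(variance_pct.values())

for name, pct in variance_pct.items():
    print(f"{name}: {pct:.2f} per cent")
print(f"unaccounted for: {unaccounted_pct:.2f} per cent")
```

The remainder of roughly 35 per cent. is what the footnote assigns to unisolated group factors, unreliability and specificity.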

Varieties of Psychological Tests 

Several classifications of tests have been proposed, and it will be 
useful to indicate which of the different types find most scope in 
vocational procedures such as those developed in the Forces. 

* The squares of the correlations provide a better index of the make-up or
content of tests. Thus in the case of Matrices with g and k : m loadings of .79
and .15, the variance attributable to these main types of ability is 62.41 per cent.
and 2.25 per cent., leaving 35.34 per cent. unaccounted for. This amount
represents group factors that have not been isolated, together with the test’s
unreliability and specificity. The actual factor loadings of many of the chief
naval and army tests are listed by Vernon (1947b).

Group v. Individual.—In large-scale vocational work, little individual
testing is possible, and as big numbers are needed for
trying out and validating any test (whether it is to be rarely or 
frequently applied), the tendency among vocational psychologists 
is to convert more and more of their instruments into group form. 
Even the interview, for example, is partly displaced by the biographical
questionnaire which can be filled in by a group under
supervision. Mechanical assembly tests employed in the Army and
A.T.S. were adapted so that eight testees could be dealt with
simultaneously. With sufficient ingenuity there is no reason why
the majority of performance and apparatus tests should not
become automatically self-scoring. Another reason for this tendency
is that fully qualified psychologists cannot afford to spend
much time on testing; tests, and their scoring, must, therefore, be 
simplified as much as possible, and the personal element in indi¬ 
vidual testing reduced, so that accurate testing can be done by less 
highly-trained individuals (cf. U.S. War Department, 1946). 

Paper-and-Pencil v. Practical.—The former includes, besides
verbal intelligence, educational or trade knowledge tests, non-verbal
g and spatial judgment tests based on abstract diagrams,
and others (particularly in the mechanical field) based on
pictures. Some tests are intermediate; for example, morse or code-learning
tests involve the practical discrimination of auditory
patterns and written responses. Owing to the difficulties of constructing
reliable apparatus or performance tests in war-time, and
of maintaining sufficient sets in standard condition for application
to thousands of recruits a week, there was a tendency to rely very
largely on paper-and-pencil tests. This occurred, too, in America,
where the difficulties were less acute. Commenting on the same
tendency after the first World War, Hull (1928) surmised that a far
greater variety of behaviour could be sampled by performance and
apparatus tests. Admittedly there may be some danger of handicapping
the “handyman,” or the recruit with manipulative skills
who, perhaps through no fault of his own, left school early and is
inefficient in reading and writing. Whether we are favouring the
“bookish” man in all jobs, both in the Forces, the Civil Service,
the police, industry, etc., is a problem that needs serious investigation.
But the danger should not be exaggerated, since “practicalness”
is another rather dubious construct or hypothetical trait.
Certainly a man may do better on a performance test of intelligence
than on a written one, or on a manipulative assembly test than on a
pictorial mechanical one. But there is so little evidence as yet of the
superiority of such a man in engineering or other practical jobs
that the introduction of practical tests may not be worthwhile from
the economic standpoint, though, nevertheless, often justifiable on
the grounds of “sales value,” i.e. attractiveness both to testees and
to employers.

Paper-and-pencil tests may be subdivided into “expendable”
and “non-expendable.” In the former the test blank itself is used,
in the latter answers are written or (as in American machine-scored
tests) marks are made in the appropriate positions on separate
answer sheets, and the test blanks themselves can be used many
times over. Non-expendable tests are often preferred because they
save paper and are more easily scored, but they undoubtedly
impose an additional handicap on duller testees. The creative or
inventive-response type of test item is also more suitable for such
testees, and was largely employed in British tests because multiple-choice
or selective-response tests and examinations are still so
unfamiliar in this country. But it is difficult to construct inventive
items which have only one right answer. Various alternatives are
possible both in the information and mathematical tests described
below. Even when all permissible answers are listed, there is still
some danger of differences in scoring between different testers.
American psychologists claim an additional advantage for selective-response
tests, that the testee does not have to phrase his answer
and can, therefore, express his knowledge or ability more directly. 
Such arguments have been critically discussed by one of the 
writers elsewhere (Vernon, 1940). 

Aptitude v. Attainment Tests.—The contrast between inborn
and acquired ability is less stressed in present-day psychology than
in the past, since it is realised that every test involves both. A
relative distinction is useful, as Traxler (1946) puts it, between
tests based on tasks in which there has been little or no formal
training, and others based on tasks similar to those that have
been formally taught. Thus, vocabulary constitutes a legitimate
test of intelligence because knowledge of words is chiefly picked up
incidentally. Information tests such as S.P. 117, designed to show
how much general mechanical and electrical information a man has
acquired from hobbies, from doing odd jobs at home or handwork
at school, or from reading (but which exclude items involving
specific trade experience) appeared to provide better indications of
interest in, and aptitude for acquiring further skill in, mechanical
jobs than did any tests belonging to the more conventional aptitude
category. Similarly in predicting university achievement, it is often
found that previous attainments and tests of such generalised study
skills as reading comprehension are of more value than intelligence
tests (cf. Vernon, 1939a; Eysenck, 1947b).

Analytic, Analogous and Work-sample Tests.—This classification
of vocational tests overlaps closely with the previous one. Analytic
tests of the elementary sensory and muscular functions believed to
be involved in a job, or analogous tests which parallel the main job
operations, are the generally accepted tools of vocational selection,
since they are supposed to be uninfluenced by the unequal experience
or training candidates may have had. While numerous investigations
favourable to such tests have been reported, particularly
by German industrial psychologists, they do not bulk largely in
most modern vocational schemes, for several reasons.

(1) There is clearly considerable truth in the claim of Gestalt
psychologists that every industrial operation is a whole which
is more than the sum of its elements. Proficiency depends
less on the sensory-motor capacities covered by analytic
tests than on the successful integration of such capacities.
Analogous tests also often give poor predictions because the
structure of the abilities which they test differs in some
essential respect from the structure of the job itself.

(2) The analytic or analogous approach is extremely susceptible
to the “naming fallacy.”

(3) Vocational classification usually has to cover so great a
variety of different jobs that it would be impossible to devise
separate batteries of tests even for the main ones, or to put
all the candidates through several batteries. The only practicable
approach is to test the common factors or generalised
abilities which enter into a large number of jobs by a limited
battery of tests, and then, perhaps, to add one confirmatory
test in each main job which can be applied to likely candidates
for that job.

(4) This approach is justified by the findings of factor analysis.
In a heterogeneous population such as recruits or school-leavers,
factors such as g, v : ed and k : m, when supplemented
by biographical data, cover a major proportion of
the ground, and the addition of specialised tests for particular
jobs adds relatively little to the success of predictions. There
may be more justification for applying batteries of specialised
tests in vocational selection where the candidates form a 
more homogeneous group (cf. Chapter XII). 

“Work-sample” tests appear to be differently interpreted by 
different industrial psychologists. Thus Viteles (1932) implies that 
work-sample testing involves giving the candidates some pre¬ 
liminary training on the job itself and deducing their eventual 
proficiency from their initial results or rapidity of learning. Others 
use the term of any trade test such as the dictation of a standard
passage at standard speed to test stenographic ability. Several of 
these were widely used in the Forces for evaluating the trade 
experience of recruits. 

We may conclude that the most appropriate tests for vocational 
classification are of three kinds: 

(i) Group tests for the main factors underlying proficiency; 

(ii) Work-sample tests and tests of trade knowledge; 

(iii) Supplementary tests intermediate between the analogous
and the work-sample type, which attempt to reproduce, as it
were in miniature, certain essential features of the job not
covered by (i). Of those described below Agility (Test 16),
Morse Aptitude (Test 10), Asdic gramophone records and
the N.I.I.P. clerical test, come under this heading. The
fact that these were to some extent influenced by training
or experience did not greatly detract from their value, both
because recruits with relevant experience were found to be
so much more readily trainable (cf. Chapter VIII) and
because, as pointed out above, attainment generally gives
useful predictions of aptitude. Moreover, there was the
safeguard that the P.S.O. could interpret the test scores in
the light of opportunities. A high score on, say, a clerical 
test would be more significant in a recruit whose civilian 
job was non-clerical. 

Planning and Construction of Tests 

The traditional approach of the vocational psychologist is to 
start with a detailed job analysis of the methods of good and poor 
workers at the job, to construct a large battery of analytic or 
analogous tests for measuring the abilities so revealed, and to 
validate this battery empirically (cf. Bingham and Freyd, 1926;
Hull, 1928). We would suggest that this applies only in vocational
selection schemes whose scope is limited to one or two particularly 
important jobs, the applicants for which are fairly homogeneous in 
g, education and experience. Certainly the psychologist concerned 
with vocational classification should try to study workers on the 
job, and should consult supervisors, training staff, etc., regarding 
the nature of the work, but with a rather different aim in view. 
First he needs to know as much as possible about the physical and 
social conditions under which the job is performed, and, secondly, 
he should try to assess what qualities are not likely to be covered
by his standard tests and by biographical information. In radar and 
asdic operating, for example, even when these have been given full 
weight, there are likely to be left over certain visuo-perceptual and 
auditory capacities which do demand special tests. Both jobs, too, 
appear to demand a lot of knob-twiddling and concentration of 
attention, but the experienced psychologist recognises that the 
effort to try to measure such qualities would probably be wasted. 
If, as in these instances, additional tests appear desirable (or if
follow-up evidence shows that his predictions are insufficiently 
accurate), these tests are designed, not so much to measure hypo¬ 
thetical aptitudes as to duplicate as closely as possible the remain¬ 
ing important aspects of the work. Thus, in planning tests for asdic 
operators (cf. p. 26) it was mistaken policy to list sense of pitch, 
sense of timbre, auditory threshold, etc., and to apply tests like 
Seashore’s which claim to measure these. In fact, such tests merely
acted as rather inefficient tests of general intelligence. But by constructing
gramophone records with asdic noises which resembled
the discriminations needed in the job itself, which were also so 
simple that their dependence on g was low, a distinct contribution 
was made to the prediction of operating efficiency. 

In the planning of tests for vocational classification, there are
two factors which are on the whole more important than detailed
job analysis, namely, the experience of the psychologist, and practicability.
The psychologist needs to know what kinds of tests have
worked well in the past in selecting for similar jobs, and should be
familiar with the evidence from American and other relevant
literature.


The economic aspects of testing have been recognised by many
writers, though hardly sufficiently stressed. Hull admits that
progress in testing is likely to depend as much on the reduction of
testing costs and the development of mechanical methods of
scoring and of weighting scores, as on the predictive value of the 
tests themselves. Viteles points out that it is not worth while 
embarking on a programme of testing unless a sufficiently large 
number of sufficiently co-operative workers can be ensured on
whom to try out and validate the tests, and unless a satisfactory
criterion of their proficiency is available. In the Forces, for example,
it appears so difficult to obtain an adequate measure of officer 
ability (nobody knows, or no two authorities agree, as to just what 
constitutes a good or a poor officer) that it is hardly worth attempting
to devise tests of officer aptitude. Other important considerations
include:

(1) What kind of tests can be given and scored by the available
staff in the available time, both at the experimental trial
stage and in routine practice ? Most of the Service tests listed 
below are very short (ten minutes or less) because of the 
restrictions on testing time and because of the danger of 
recruits becoming “fed up” if over-tested. Even the effects 
of practice which would arise if recruits took several hours 
of tests are shown to be serious, in the next Chapter. As
much relevant information as possible should be obtained
from questionnaires and interviews which do not look like
tests, even if, in fact, the answers are standardised and 
validated in such a way as to constitute additional tests. 
Headmasters similarly tend to resent allotting more than 
an hour or two to the testing of school-leavers. Promising 
tests of interests and of neurotic tendencies could not be 
introduced in the Army because their scoring (in the absence 
of machine methods) was too lengthy.

(2) Will the paper or other materials be forthcoming in the
future as well as at present? Can apparatus be maintained in
proper working order? The paper shortage in the war considerably
restricted the use of tests containing spatial or
mechanical diagrams and pictures, and the diversion of 
almost all skilled mechanics into the Forces or into arma¬ 
ments production rendered it virtually impossible for 
psychologists to adopt any tests involving elaborate apparatus.
Even a very simple test of visual acuity which was
applied to all A.T.S. recruits during one period had to be
abandoned because it was yielding utterly inconsistent
results in different places. A complex reaction time test for
motor drivers (Chleusebairgue, 1939), “miniature situation”
tests for layers and for radar operators, tests of ability to talk
over the telephone, and tests of night vision (as distinct from 
dark adaptation) were others which were regarded as valu¬ 
able but impracticable. 

(3) Can the psychological and statistical staff cope with the work
needed in devising, analysing and calibrating a new test?
Is it likely to be sufficiently superior to older established
tests to justify this expenditure of technicians' time?

(4) The range and general level of intelligence in the population 
for which the test is intended must also be borne in mind. 
Some tests which might be quite practicable at upper levels 
may need far-reaching modifications and experimentation to 
be got across to duller men. Without this they may merely 
measure g over again, or be highly susceptible to practice
effects. A warning has already been given against separate
answer sheets and selective-response items. Figures, diagrams
and drawings need to be large and clear; thus, here,
too, lack of paper may be a limiting factor.

The production either of a general or a supplementary test is a 
highly complex and technical business, and we can only touch on 
a few points here. When the purpose and nature of the test have 
been decided, suggestions for items should be collected from 
several psychologists or other sources, and they should be carefully 
scrutinised and revised to eliminate ambiguities, to ensure technical
accuracy, and to cover the requisite range of difficulty. A
large excess of spare items should be included, and if alternative 
forms are likely to be needed, it is preferable to devise and try them 
all out simultaneously. Suitable instructions, for the testees and
for the testers, must be prepared and time limits decided, subject 
to modification in the light of preliminary trials. Usually at least 
two large-scale trials, on populations which number several
hundred and which parallel the populations for whom the test is 
intended, are needed for the following purposes: 

(1) To see if the layout and instructions are adequate, and 
whether there are gross defects in any items. 

(2) To validate the test as a whole and to show that it adds 
appreciably to other existing tests, or is superior to other 
proposed tests.

(3) To establish the difficulty, also the consistency and/or
validity of each item.

(4) To validate the final form or forms and determine its predictive
value in one or more jobs.

(5) To find its reliability.

(6) To analyse the main factors it involves by comparison with 
other tests. 

(7) To calibrate, or determine its norms. 

Calibration 

Neither the obtained “raw” score, nor the percentage score on a
test or examination provides a precise indication of the goodness or 
badness of performance. The results of intelligence or educational 
tests among children are, therefore, often expressed as mental or 
educational ages. Though it may be useful to know that, say, the 
arithmetic of a poorly educated adult is only equivalent to the level 
achieved by average ten-year-old children, age units are scarcely
applicable to superior or even to average adults. Abilities increase
or decline after fourteen years in so varied a manner (cf. Chapter
XI) that no fixed standards can be determined. Intelligence
quotients or I.Q.s are sometimes used with adult intelligence tests, 
for example, the Wechsler-Bellevue scale, but are disliked by 
vocational psychologists not merely because their derivation differs 
from that among children*, but also because comparable units for 
other tests are not available. For example, it might be more confusing
than helpful to turn mechanical comprehension or morse
aptitude test scores into mechanical quotients and morse quotients,
and so on.

In general, therefore, adult test results are interpreted by percentiles.
For example, a score of 46 on the (twenty-minute)
Matrices falls at the 90th percentile for representative adults, since
90 per cent. score 46 or below. Similarly a score of 36 is the median
or 50th percentile. These percentile levels can be based on any
convenient group. Thus, in the Navy different percentile norms 
were available for ordinary seamen and for R.N.V.R. officer cadets. 
As, however, the percentile scale is unnecessarily detailed for most 
purposes, both the Navy and Army divided the ranges of scores

* Instead of being based on the ratio of mental to chronological age, they are
usually obtained by converting test scores to an arbitrary scale with a mean of
100 and a standard deviation of 16 or 16}-, or 20, etc.

on most of their tests into selection grades or groups (S.G.s).*
Group A or S.G.I represents the top 10 per cent. of scores, i.e.
the outstanding.

Group B or II represents the next 20 per cent. of scores, i.e.
above average.

Group C or III+ and III- represents the next 40 per cent. of
scores, i.e. average.

Group D or IV represents the next 20 per cent. of scores, i.e.
below average.

Group E or V represents the bottom 10 per cent. of scores, i.e.
very low.

If the same grouping was applied to the Terman-Merrill
revision of the Stanford-Binet test the I.Q.s falling into these 
various grades would be approximately: S.G.I 121 and upwards, 
S.G.II 109-120, III+ 100-108, III- 92-99, IV 80-91, V 79 and 
below. Such a system is easily got across to the layman, and has the
great advantage that the S.G.s or the percentiles for all tests
standardised on the same population are comparable. A 70th percentile
score, for example, represents the same relative degree of
superiority on Matrices or mechanical comprehension or morse
aptitude.
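
The grading scheme just described can be set out as a short computational sketch. The Python fragment below is illustrative only: the function names, the treatment of a score falling exactly on a boundary, and the idea of passing in a list of norm-group scores are our assumptions, not part of the Service procedure. It finds the percentile rank of a score within a norm group and then assigns the selection grade, using the 10, 20, 40, 20 and 10 per cent. divisions given above.

```python
from bisect import bisect_right

def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring at or below `score`."""
    ordered = sorted(norm_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)

# Upper percentile bound of each grade, lowest grade first:
# E = bottom 10, D = next 20, C = middle 40, B = next 20, A = top 10.
# Assumption: a percentile exactly on a bound goes to the lower grade.
GRADE_BOUNDS = [(10, "E"), (30, "D"), (70, "C"), (90, "B"), (100, "A")]

def selection_grade(percentile):
    """Map a percentile rank (0 to 100) to a selection grade A to E."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    for upper_bound, grade in GRADE_BOUNDS:
        if percentile <= upper_bound:
            return grade
```

Since both functions depend only on rank within the norm group, a given grade carries the same meaning for any test standardised on that group.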

Yet another form of calibration sometimes used in vocational
work consists of minimum or critical test scores needed for 
entrance to certain jobs. This is an undesirable system for several 
reasons. First, it neglects variations in supply and demand. 
Obviously the critical score can be raised when there are plenty of
candidates for a few vacancies, and must be lowered if the situation 
is reversed. Secondly, such rigidity is essentially unpsychological. 
We have shown that test scores constituted only one of the many
factors influencing employment recommendations. P.S.O.s were
encouraged to select men with lowish scores who possessed other
outstanding qualifications, and to raise their standards if they did
not. Thirdly, there are numerous technical difficulties in fixing
satisfactory pass marks (cf. McClelland, 1942), particularly when 
several tests need to be taken into account. 


but it is less readily grasped by the non-psychologist.

Interpretation of Test Results

Although we have rejected the notion of critical scores, we can¬ 
not shirk the problem of providing some method of indicating the 
relevance of test performances to jobs. Not only the P.S.O. but 
also the interested layman wishes to know how accurately or 
inaccurately a battery of tests will select, and at what level of scores 
candidates are likely to succeed or fail. The only method which 
satisfies the statistician is that of correlation and regression coeffi¬ 
cients, but, unfortunately, it is hardly intelligible to persons with¬ 
out statistical training, and is often scarcely practicable. Most of 
the proposed alternatives such as “wastage coefficients,” “efficiency 
indices,” etc., are very cumbersome and tend to be misleading. For
example, a common procedure is to compare the distributions of 
test scores of men who turn out to be good and bad at the job, and 
to choose a pass mark or critical score which admits as many of the
former and rejects as many of the latter as possible. This is reprehensible
since such distributions (unless obtained from thousands
of cases) always show irregularities, and one is tempted to choose 
a mark which, by taking advantage of irregularities, gives the most 
favourable differentiation. In fact, the smaller the groups com¬ 
pared, the more optimistic is the forecast of the test’s validity likely 
to be. Moreover, the numbers or proportions thus found to be 
correctly selected or rejected depend to a large extent not merely 
on the goodness of the test, but also on the proportions of goods 
and bads, and on the proportion of candidates which P.S.O.s can
afford to exclude. As pointed out in Chapter VII, the higher the 
selection ratio, the better will the selection test appear to be— 
provided that candidates wrongfully excluded by it are con¬ 
veniently forgotten. Other common procedures are to list the 
percentages, or to draw graphs or histograms, showing the proportions
of men at various test score levels likely to do well or badly
at the job. These, too, suffer from the defect of being incomparable 
with jobs that have different failure rates. A slightly more satis¬ 
factory system was adopted by the U.S. War Department (1946), 
and is illustrated in Fig. 3. This shows that candidates scoring 140
on the clerical aptitude test had 84 chances in 100 of achieving 
better than average results at a training course for clerks, whereas 
candidates with scores of 60 only had 7 chances in 100 of doing as 
well. This method can at least be applied uniformly with all tests 
and all jobs, but it does not readily indicate how low the P.S.O. can
go in selecting men unlikely to fail the course. Nor, when several
tests are predictive of success, does it indicate how their findings 
are to be combined, or how much attention should be paid to each. 

In the British Army, S.G.s were suggested for each main job on
the tests believed, as a result of job analysis, to be most relevant,
these being modified appropriately in the light of distributions of
scores among men actually engaged on the work, and of later
validation data. Thus, a minimum of S.G.3- might be set on the
clerical test, but no minimum on the mechanical comprehension 
test, for clerks. Though this system is generally applicable and 
adequately flexible, it was unsatisfactory for several reasons. 

(1) No information was provided as to the likelihood of success 
or failure above or below the minima. 

(2) The imposition of different minima on different tests is not,
in fact, equivalent to weighting the tests in accordance with 
the obtained regression equation. 

(3) Some P.S.O.s found it too complicated and preferred to rely 
on their own hunches, but others followed it too literally, 
with disconcerting results. For example, if no standard was 
set on one test, say arithmetic, most of the weak arith¬ 
meticians would be herded into this job. 
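
Point (2) may be illustrated by a small sketch in Python. The two candidates, their standard scores, the single minimum and the weights are all hypothetical, chosen only to show that a rule of separate minima and a weighted composite of the regression type can order the same two men differently.

```python
def passes_minima(scores, minima):
    """Multiple cut-off rule: every test that has a minimum must reach it."""
    return all(scores[test] >= floor for test, floor in minima.items())

def composite(scores, weights):
    """Weighted sum of standard scores, of the kind a regression equation gives."""
    return sum(weights[test] * scores[test] for test in weights)

# Hypothetical values, in standard-score units (mean 0, S.D. 1).
minima = {"clerical": -0.5}                      # no minimum set on arithmetic
weights = {"clerical": 0.6, "arithmetic": 0.4}   # illustrative weights only

candidate_a = {"clerical": -0.6, "arithmetic": 2.0}   # just under the minimum
candidate_b = {"clerical": -0.4, "arithmetic": -2.0}  # weak arithmetician

for name, scores in (("A", candidate_a), ("B", candidate_b)):
    print(name, passes_minima(scores, minima), round(composite(scores, weights), 2))
```

Under these assumed figures the minima admit candidate B and exclude candidate A, although the composite places A well above B. It is in this way that weak arithmeticians come to be herded into a job for which no arithmetic standard is set.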

It is the writer’s conviction that any such mechanical system of
test interpretation would soon be liable to develop similar flaws. A 
scheme which was developed in the Navy gives the distributions 
among passes and among failures (or men regarded by instructors 
as very poor), on two or three of the most relevant tests only.
Usually one test is the measure of all-round ability, T2, and the
others are supplementary ones to which special attention should be
paid*. As an example, Table XII lists the information for coders.


Table XII. Selection Test Data for Coders
Period: July-Sept. 1943, N = 628

                              T2            Test 1           Test 70
                                          Abstraction       Dictation
S.G.                      Pass   Fail     Pass   Fail      Pass   Fail
A                         31     ½        28½    ½
B                         41     2½       49     3
C                         26     7        19     7
D                         2               2
E                         ½      1        —      —
90 percentile                126             16               35
Median                       100             13               33
10 percentile                 72             10               30
Correlation with
practical training          .643                             .370

The figures could equally well be presented in the form of a histo¬ 
gram. The pass columns give the percentage S.G. distributions 
(to the nearest half) among trainees found satisfactory. The fail 
columns give the proportional numbers. Thus, the actual failure
rate was 11.6 per cent., corresponding to 12½ fails for every 100
passes. Percentile levels among passes are also shown. Thus, the 
P.S.O. can readily see that As and Bs on the three tests have 
excellent chances of passing, that Cs are doubtful bets, and that 
only Ds with outstanding additional qualifications are likely to be 
suitable. At the same time as indicating desirable standards for 
selectees, this scheme provides evidence of the value of the main
tests and involves no statistical manipulation of the empirical data 
which would be liable to misinterpretation. It might well be 
adapted to civilian educational and vocational classification pro¬ 
cedures. For example, similar tables might show the adequacy of 
a child’s scores on general intelligence, educational and mechanical 
tests in relation to academic or to technical secondary schooling. 
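
The way in which the P.S.O. reads such a table can be sketched in a few lines of Python. Since the fail column is expressed per 100 passes, the chance of passing at a given grade is roughly the pass entry divided by the sum of the pass and fail entries. The T2 figures for grades A to C are taken here as 31 and ½, 41 and 2½, and 26 and 7, which is our reading of Table XII; the function name is an assumption made for illustration.

```python
def pass_chance(pass_entry, fail_entry):
    """Approximate chance of passing for men in one grade.

    pass_entry: percentage of all passes falling in the grade.
    fail_entry: fails in the grade per 100 passes overall.
    Both entries share the same base, so their ratio is meaningful.
    """
    return pass_entry / (pass_entry + fail_entry)

# Assumed reading of the T2 columns of Table XII: grade -> (pass, fail).
t2_entries = {"A": (31.0, 0.5), "B": (41.0, 2.5), "C": (26.0, 7.0)}

for grade, (passes, fails) in t2_entries.items():
    print(grade, round(pass_chance(passes, fails), 2))
```

On this reading the chances run from nearly certain for As to rather better than three in four for Cs, which agrees with the description of Cs as doubtful bets.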

* In most naval jobs it was found that a very close approximation to the
multiple regression equation could be obtained by T2 alone, or by T2 plus
one or two supplementary tests equally weighted.















CHAPTER XI 


EFFECTS OF PRACTICE, AGE AND OTHER FACTORS 
ON TEST SCORES 

Abstract. Investigations of civilians (adults and children) and of
recruits show an average rise, on taking a psychological test a
second time, equivalent to about 6 I.Q. points, apparently regardless
of the proximity of the re-testing. Further re-tests produce
progressively smaller increases, hence the provision of practice 
sheets or of one or two preliminary tests, helps to even up testees 
who possess different degrees of previous experience. The effect 
is smaller in straightforward tests with ample time limits, than in 
choice-response tests or tests with complicated instructions, with 
which most testees in this country are unfamiliar. Though the
practice effect of one test on another is about twice as great when 
they are identical as when they are only partly similar in form or 
content, yet any kind of schooling and the taking of examinations 
may produce slight improvements among adults. 

Recruits did not constitute sufficiently representative samples 
of the general population to provide definitive results on changes 
in test scores with age. However, the more rapid decline in abstract
intelligence, spatial and physical abilities than in educational 
abilities was confirmed, and considerable evidence was obtained 
of greater declines among adults of initially low ability. Not only 
education but also intelligence appears to be better retained by
those who “use their brains” more. Again, mechanical-spatial 
abilities show increases from 14 to 18 years; general intelligence 
probably rises only among those receiving further education, and 
educational abilities decline among those not receiving schooling. 

Small but consistent geographical differences in intelligence 
were found between the east or south and areas which contain 
considerable proportions of Welsh and Irish recruits. Intakes into 
the three men’s Services varied at different times, and average 
differences between Army, Navy and R.A.F. ground personnel
were small. However, the Army certainly received the biggest
proportions of low grade, and R.A.F. air crew large proportions of
high grade, recruits. Members of pre-Service organisations (A.T.C.,
Sea Cadets, Army Cadets, etc.), also Scouts, showed superior 
qualities in the Services. But this was attributable more to the 
intellectual, educational and other traits of youths who join these 
organisations than to the training as such, which was shown to 
produce rather limited effects. Physical training courses were 
found to improve the scores of Army recruits with poor physique 
on certain physical and mental tests, but menstruation among 
A.T.S. recruits had no consistent influence on test performance. 

A very large proportion of recruits in the Forces and of adults 
in industry have not met an intelligence, or other psychological, 
test before, hence their scores are likely to improve considerably 
as they gain familiarity with such tests. It is important to determine 
the extent of such improvement, both because different recruits no 
doubt possess different degrees of previous experience, and 
because re-testing is frequently necessary when original scores
cannot be traced. We will first outline some of the published 
evidence on practice effects. 

Dearborn and Rothney (1941) summarise most of the American 
work and show, from their Harvard Growth Study, that practice
effects are generally not very large, but that they do occur with
some (not all) group tests. They do not seem to be confined to
any particular type of test material. With repeated testing the 
effects tend to diminish, i.e. the greatest increase is from the first 
to the second test. These authors also claim that practice on one 
test affects that test only and does not extend to other slightly 
different tests. 

Some, but not all, of these results are confirmed by British investigations,
and we would suggest that the difference lies in the greater
degree of “test-sophistication” among American children and 
adults. Not only do they habitually take more intelligence tests, 
but also most of their examinations are new-type ones, which are 
made up of questions similar to those in intelligence tests. We
would therefore expect rises to be larger in this country, and 
spreading to occur from any one test to other dissimilar ones. For 
example, Vernon (1938b) investigated test-sophistication among
university-trained adults and showed that the study of tests in
general and their make-up, and practice in taking various types of
test, raised scores both on verbal and non-verbal intelligence tests
which had not been taken before by as much as 84 per cent.* An
American study by Bowers and Woods (1941) points in the same 
direction. They compared the intelligence test scores of over 1,700 
students who had taken none, one, two and three or more intelligence
tests during their school careers. If the scores of the no-experience
group are reduced to 100, the scores of the other three
groups average 106, 108 and 110 respectively. Evidence is then 
given to show that the differences even up in the course of a year’s 
work at college, and that they are far greater among students from 
rural areas and small towns than among students from large towns. 
A plausible explanation for this is that the big high schools in large 
towns would make more use of new-type achievement tests, and 
that their ex-pupils would be more test-sophisticated. 

A. G. Rodger (1936) applied six parallel tests at fortnightly 
intervals to ninety-five British 11-12 year old pupils and claims 
that the average increase was about 1 per cent., or 1 I.Q. point,
per test. His figures suggest, however, a rise of 3.8 per cent.
between the first and third tests and thereafter no further change. 
The Moray House tests which he used have a fore-exercise or 
practice sheet, which probably minimises the susceptibility of
the tests themselves to practice. Rodger also states that the
rise was greatest (1½ points) in brighter children of I.Q. about 120,
and lower (J point) in dull children of I.Q. about 80. Dearborn 
and Rothney make a similar claim. It is a nice theory that the most 
intelligent, because of their intelligence, improve most. But all our 
evidence, cited below, shows the greatest rises among those scoring 
least. Dearborn’s and Rodger’s conclusion is, in fact, based on
changes in test norms, not on rises among actual pupils. 

McRae (1942) likewise gave sets of six tests to small groups at 
weekly intervals and confirms the diminishing effects of repeated 
practice. He concludes that when testees vary initially in their 
previous familiarity with tests, a single test will act as a sufficient 
“shock-absorber” to bring them all on a par. He also noted, when
giving parallel versions of a test, that the effects on Form B of
actual coaching on Form A were no greater than the effects of
merely doing Form A in the ordinary way.

Dave (1938) found that some types of test are much more 
affected by practice than others, non-verbal and spatial items 

* All score changes quoted in this Chapter have been converted into a
common scale, roughly parallel to intelligence quotients. That is, each alteration
has been divided by the Standard Deviation of scores and multiplied by 15.

apparently being more susceptible than most verbal items. Finally,
recent researches by Heim, Wallace and Carpenter showed that
continued practice on a single test may improve scores almost
indefinitely*. When nine W.E.A. students took the A.H.4 test
eight times in successive weeks, there was some tendency to 
flattening out after about the fifth occasion, but some subjects con¬ 
tinued to rise until they attained nearly perfect scores. Nine seamen 
tested nine times on almost consecutive days gave similar results. 
These show the dangers of leakages of test material and of 
unauthorised coaching. It was noteworthy, however, that the 
correlations between “unpractised” and “practised” test scores
remained extremely high. Thus a test may still be a valid test of 
intelligence after practice, provided that all testees have had the 
same amount of practice. 

Turning to work in the Forces, the Matrices test was re-taken by 
637 seamen in an entry establishment one to six months after it 
had been done at recruiting centres. The average rise was 4.7 points
or 8.6 per cent. But as the reliability of this test is rather low, the
correlation between the two sets of scores being only .79, the
alterations were irregular. Some men actually declined on the 
second occasion, and the total range of changes was from 26 points 
increase to 13 points decrease. A natural consequence of what is 
called the regression effect is that very high scorers showed least 
improvement, very low scorers most. 
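
The regression effect can be made concrete with a rough sketch in Python. It assumes a linear regression of second score on first with equal spread on the two occasions, and a common scale with a Standard Deviation of 15; the function names are ours. The re-test correlation of .79 and the average rise of 8.6 are the figures quoted above for Matrices.

```python
COMMON_SCALE_SD = 15.0  # assumed S.D. of the I.Q.-like common scale

def to_common_scale(raw_change, raw_sd):
    """Express a raw score change in common-scale units."""
    return raw_change / raw_sd * COMMON_SCALE_SD

def expected_change(first_score_z, retest_r, mean_rise):
    """Expected gain, in common-scale units, on re-test.

    first_score_z: first score in S.D. units above (+) or below (-) the mean.
    retest_r: correlation between first and second scores.
    mean_rise: average gain of the whole group, in common-scale units.
    Assumes linear regression and equal S.D.s on both occasions.
    """
    return mean_rise + (retest_r - 1.0) * first_score_z * COMMON_SCALE_SD

for z in (2.0, 0.0, -2.0):
    print(z, round(expected_change(z, 0.79, 8.6), 1))
```

Under these assumptions a man two Standard Deviations above the mean is expected to gain only about 2 units, the average man 8.6, and a man two Standard Deviations below the mean nearly 15, without any real difference in their susceptibility to practice.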

An experiment in the Army where the same test was re-taken by 
277 men after only one day yielded almost the same rise, and other 
later work suggested that practice effects are much the same after 
several months as they are immediately. But a possible alternative 
explanation is that military or naval training, including the taking 
of proficiency examinations, also helps performance at tests, hence 
a rise after six months may be partly due to this and only partly to 
recollections of the previous testing. Sometimes, of course, the 
training received is directly relevant to the abilities tested, e.g. 
mathematical or mechanical. This would account for the different 
findings in the major Naval and Army experiments on test
reliabilities and re-test rises. 

The Army results listed in Table XIII were based on re-testing 

* From unpublished work communicated by courtesy of the M.R.C.’s Unit
for Research in Applied Psychology, at the Cambridge Psychological Laboratory.

Table XIII. Percentage Increases on Re-testing Among Army and Naval
Recruits

                               Percentage Rises
Test                           Army      Navy
 1 Abstraction                  —         3.2
 2 Bennett Mechanical           4.0      10.4
 3 Arithmetic Mathematics       2.1       8.6
 4 Squares                      8.0      10.6
26 Verbal                       3.3       —
12 Clerical                     6.6       —
 8 Assembly                     6.0       —
10 Morse Aptitude               3.9       —
16 Agility*                     7.2       —

600 men, representative of the total intake, after eight weeks of 
primary training (which involved little or no bookwork). In the 
Navy, however, 600 air mechanics were re-tested after six to eight 
months of mechanical training. In spite of the longer interval their 
increases on Tests 2 and 3 are far larger. Squares (Test 4) is 
slightly larger, but Abstraction (Test 1, not taken in the Army) 
was probably uninfluenced by their training, hence it shows only 
a small rise. 

The usual regression effect was observed on most tests. The top
10 per cent. of scorers achieved only a slight improvement or even
declined on the second test, and lower groups showed successively 
greater increases. But in three tests (Verbal, Clerical and Squares,
the latter in both Services) men scoring in the 90th to 60th
percentiles at the first test showed as great improvement as those 
in the bottom half. Probably this is due to the positively skewed or 
almost rectangular distributions of scores on these tests. 

The total increase on T2 in the Naval group was larger than 
that on any component test, namely, 12.4 per cent. This is very
considerable, being equivalent to an average rise of nearly 1 S.G. 
Even if half of it is attributable specifically to the mechanical 
training, it means that T2 is very sensitive to the practice effects 
of previous testing and of more generalised schooling. 

Some evidence indicates that the effects of taking a different set 
of tests before are smaller than the effects of doing the same test. 
The average increase, shown in Table XIII on eight Army tests, 

* The high figure for Agility might be ascribed to improvement of physique
among recruits in their first eight weeks of Army life. However, in an earlier
experiment, an almost immediate re-test yielded a still greater improvement of
nearly 12 per cent.

also on Matrices and Abstraction, due to taking the same test
before (along with others) amounts to 6.3 per cent. But in an
experiment described below (p. 107) the Naval T2 tests and the
R.A.F.’s G.V.K. tests were taken by the same recruits a few days
apart, and the practice effect on whichever battery was taken second
was estimated as 3.6 per cent. This is similar to the 2½ per cent.
(2 to 3 points of I.Q.) admitted by Terman and Merrill when 
Form M of the revised Stanford-Binet test is given shortly after 
Form L, or vice versa. 

Further consideration of Table XIII suggests that the biggest
effects occur in tests which are most novel, or which have most
complicated instructions. Squares and Clerical are particularly
difficult to “get across.” The Arithmetic test, which is entirely
straightforward and familiar, has the smallest increase of 2.1 per
cent., and the Naval Abstraction test, whose instructions are so
simple that it is self-administering, is also very low. This is borne 
out by the following investigations. 

Several versions of a mechanical comprehension and information 
test, and of a mathematics test, were devised for use at recruiting 
centres, and were simplified in order to make them self-administering.
These were tried out on 1,400 recruits in H.M.S. Royal
Arthur of whom 190 had taken a version of these tests a few
months earlier at recruiting centres, while the remainder had taken
Matrices. The former group showed no superiority on the mathematics
section, and were only 1.6 per cent. superior on the
mechanical section as a result of their previous experience. 

In an Army experiment, Matrices was given twice on consecutive
days with a forty-minute time limit to 270 recruits, and
an alteration of under 6 per cent. was found, contrasting with the
figure of over 8 per cent. obtained with a twenty-minute limit.
The longer limit presumably allowed the testees to get more used 
to the unfamiliar task*. Other studies were made of dictation and 
spelling tests. Five passages were tried out on 647 seamen in an entry 
establishment, each class getting a different set of two passages in 
different order. The average scores on the second passage were 
only 1-2 per cent, superior, and this is perhaps attributable to 
familiarisation with the dictator’s pronunciation. In another 

* In this instance the "ceiling effect" may have entered; with a 40-minute 
limit scores tend to be so high that little improvement is possible among the 
brighter testees. On most of the tests whose results are quoted in this chapter, 
however, there was ample "ceiling." 
research, four groups of fifty recruits took five tests in immediate 
succession. One was the A.T.S. Spelling, two involved dictation of 
sentences in which one difficult word had to be written down, and 
two were of a novel type. Each item consisted of three 
near-synonyms, one of which was incorrectly spelled. Testees had to 
identify and re-write the wrong words. For example: 

STREET RODE AVENUE. 

No practice effect whatever could be discovered in any of the tests 
except this last, unfamiliar, one where it amounted to about 2 per 
cent. 

The following appear to be the main conclusions arising out of 
the investigations we have described: 

(1) Test scores are seriously disturbed if different testees have 
had different amounts of practice on the particular test. 

(2) The effects of taking other similar tests are smaller though 
still very considerable. 

(3) Such practice tends to show diminishing returns. 

(4) The effects of practice are probably lasting, being almost as 
great after some months as after a few days. This point 
requires much fuller investigation. 

(5) Apart from training in the subject matter of the test, it is 
possible that any kind of schooling and the taking of 
examinations (especially new-type ones) have appreciable 
effects. 

(6) The effects tend to be greatest among those who are least 
accustomed to tests, and possibly to bookwork in general. 

(7) The effects are much smaller in straightforward 
inventive-response tests with simple instructions and ample time 
allowance than in unfamiliar tests with elaborate 
instructions, also in verbal than in most non-verbal and 
spatial tests. 

(8) They can be reduced by adequate fore-exercises. Moreover, 
in view of No. 3, testees with different amounts of experience 
can be brought to a more even level by taking one or two 
preliminary tests. 

(9) Re-testing with the same, or parallel, tests is undesirable, 
but when unavoidable distinct norms or standards should be 
provided. So far as is known re-test scores have the same 
validity as original scores, but this, too, requires further 
study. 

Age and Other Differences 

Many results of general interest might have been anticipated 
from the large-scale testing of recruits during the war, for example, 
comparisons between the three Services, between men from 
different parts of the country, or from different occupations, and 
so forth. Such enquiries are, however, much more complicated 
than they appear at first sight since age, occupation, recruitment 
policy and other factors that influence test scores are so interwoven. 
Eighteen-year-old recruits were probably fairly representative of 
the population as a whole, at least in 1942-3, though even at this 
age a considerable number had their call-up postponed, since they 
were in reserved occupations; and these were mostly in skilled 
trades or professions and, therefore, of superior intelligence. A 
very small proportion of men of low medical category (mostly 
below average in intelligence), of psychiatrically unfit and of mental 
defectives, were rejected. The composition of older batches of 
recruits was extremely variable owing to the Ministry of Labour’s 
reservation policy. In one month, for example, large numbers of 
20-26 year old policemen were de-reserved and intakes into all the 
Services were of outstandingly high quality. In another month 
there was a big proportion of 30-40 year builders, and the quality 
of intakes was lowered partly because the average for building 
tradesmen and labourers is a little below that of the general 
population, partly because test scores tend to decrease with age. 
Again when an occupational group was partially de-reserved, 
employers would naturally tend to hang on to their best and 
presumably most intelligent men and to release the duller ones for 
the Forces. Thus observed age differences were distorted by 
differences in the occupational make-up of groups of recruits of different 
ages, and occupational differences were distorted by the age 
groups that happened to be called up during a certain period. 
Psychologists were, of course, unable to control these factors, or 
to hold them constant, as they would do in a laboratory 
experiment, hence it was almost impossible to disentangle their effects. 

Pre-war research, though often on a small scale, had established 
the following facts by taking special precautions to secure 
representative cross-sections of the population: 

(1) General intelligence increases on the average at a steady rate 
up to about 12 years. The rate of increase then declines until 
a maximum is reached by about 16 years, though on some 
tests there is little if any increase after 13 and on others rises 
have been reported even beyond 18-20*. 

(2) Between the twenties and sixties there is a progressive 
decline on tests of g involving abstract reasoning and speed 
of mental manipulation, though on other tests of what has 
been called "crystallised" intelligence, such as vocabulary 
and information, the level is better maintained (cf. Cattell, 
1943; Brody, 1944). 

(3) No one individual necessarily conforms closely to these 
average trends. "Longitudinal" studies of particular 
children often show great irregularities of mental growth, with 
spurts and plateaux attributable partly to emotional 
adjustment or maladjustment, partly to the stimulating or 
inhibiting effects of the child's home and school environment (cf. 
Fleming, 1948). The same may well be true of adults, but 
longitudinal studies are much more difficult since an adult 
cannot be re-tested many times without his scores being 
affected by "sophistication," and by his attitudes towards 
the investigation, e.g. growing hostility. 

(4) Educational attainments tend to be forgotten rapidly after 
leaving school, except in so far as they are used in daily life. 
Thus, Norris (1940) finds that scores on linguistic tests may 
rise till about 40 years, but arithmetic achievement declines 
in most persons other than clerks who practise arithmetic in 
their job. The performance of adults on intelligence tests, 
according to Lorge (1945), is also affected by education 
beyond 14 years, though, as Garrett (1946) points out, 
results on a verbal test do not necessarily mean that 
intelligence itself alters. Adults with a university education, tested 
at 34 years, were about two years superior in mental age to 
others of the same intelligence level when aged 14 years who 
had received no secondary or university schooling. An 
investigation by Miles (1932) also appeared to show an 
earlier decline in intelligence among adults who had only 
had elementary schooling, and Cattell (1943) claims that 
superior occupational groups retain their intelligence better 
than lower ones. 

* Dearborn and Rothney (1941) claim increases up to much later ages than 
16, but they are assuming an indefinite continuation of schooling. 

Several studies in the Forces helped to supplement these 
conclusions. The abstract reasoning ability which is particularly 
affected by age is typified in the matrices test. Thus, it was 
somewhat unfortunate that this test should be the one most widely 
applied in the Royal Navy, Army and A.T.S. When it was first 
introduced the Navy happened to be recruiting men mostly aged 
30 and over, and the norms which were established then, and which 
could not readily be altered, were extremely faulty when applied 
to later intakes consisting mostly of 18-year-olds. The S.G.s were 
chosen to divide the older population into 10, 20, 40, 20 and 10 per 
cent. groups, but with more representative recruits they yielded 
12 to 14 per cent. of S.G.1s and only 1 to 6 per cent. of S.G.5s 
(cf. Table XIX)*. 
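
The way in which fixed norms misgrade an intake of different composition can be sketched as follows. Only the 10, 20, 40, 20 and 10 per cent. division comes from the text; the scores are invented for illustration (an evenly spread reference group, and a later intake scoring five points higher throughout), and the function names are hypothetical.

```python
from collections import Counter

# Cumulative percentages (counted from the bottom) at which S.G.1 to S.G.4
# begin, giving groups of 10, 20, 40, 20 and 10 per cent. from the top down.
LOWER_BOUNDS = (90, 70, 30, 10)


def make_norms(reference_scores):
    """Return the minimum score for S.G.1, 2, 3 and 4 in the reference group."""
    ordered = sorted(reference_scores)
    n = len(ordered)
    return [ordered[n * p // 100] for p in LOWER_BOUNDS]


def grade(score, norms):
    """Assign a selection group (1 = highest, 5 = lowest) from fixed norms."""
    for sg, minimum in enumerate(norms, start=1):
        if score >= minimum:
            return sg
    return 5


def distribution(scores, norms):
    """Percentage of the sample falling in S.G.1 to S.G.5."""
    counts = Counter(grade(s, norms) for s in scores)
    return [100 * counts[sg] // len(scores) for sg in range(1, 6)]


older = list(range(100))           # hypothetical reference group
younger = [s + 5 for s in older]   # hypothetical later intake, 5 points higher

norms = make_norms(older)
print(distribution(older, norms))
print(distribution(younger, norms))
```

With these invented figures the reference group divides 10, 20, 40, 20, 10, while the later intake gives 15, 20, 40, 20, 5: an excess of S.G.1s and a shortage of S.G.5s, the same direction of distortion as that described above.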

A typical set of correlations for nine tests with age was obtained 
in 1942 in a group of 678 Army recruits well spaced out from 18 to 
40 years; these are listed in Table XIV. All of them, it will be seen, 


Table XIV. Correlations Between Test Scores and Age Among Army Recruits 

Test                        Correlation 
Progressive Matrices           -.340 
10 Morse Aptitude              -.296 
16 Agility                     -.208 
4 Squares                      -.193 
2 Bennett Mechanical           -.182 
12 Clerical                    -.173 
3 Arithmetic                   -.166 
5 Assembly                     -.161 
17 Messages (Verbal)           -.120 


are negative, even with tests such as arithmetic and messages which 
involve most education. This suggests that the older age groups 
were of inferior quality apart from age. But the relative order of 
correlation is none the less interesting, showing that such "v:ed" 
tests are least affected, Matrices and tests dependent on sensory 
and physical qualities most affected (the Assembly test being 
exceptional possibly because engineering experience tends to 
increase with age). It was also found that the decline was fairly 
small until 36 years or more. When all men of 21 and over were 
contrasted with the 20s and under the mean (tetrachoric) 
correlation for all tests was -.1??. When the 30s and over were 
contrasted with 29s and under it was -.149, but when 35s and over 
were contrasted with 34s and under the average coefficient rose 
to -.288. 

* It is an interesting point that the S.G.1s in the young samples have never 
reached as large proportions as would be expected from their superior mean 
scores. This indicates that, although the proportion of S.G.4s and 5s increases 
rapidly with age from 18 to 30+, the proportion of S.G.1s only sinks very 
slowly, so confirming the conclusions based on Table XVI on page 192. 
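
The text does not say how the tetrachoric coefficients were computed. As a rough illustration of the kind of calculation involved when recruits are split into two age groups and two score groups, the sketch below uses the familiar cosine-pi approximation to the tetrachoric coefficient, which is not necessarily the method used in the original study. The four cell counts are invented; the function name is hypothetical.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r for the 2 x 2 table

                     low score   high score
        younger          a           b
        older            c           d

    A negative result means that the older group tends to score lower.
    """
    if b == 0 or c == 0:
        return 1.0
    if a == 0 or d == 0:
        return -1.0
    odds_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1.0 + math.sqrt(odds_ratio)))


# Invented counts: 400 younger and 278 older recruits, each split at a
# common cutting score on one test.
r = tetrachoric_cos_pi(a=180, b=220, c=160, d=118)
print(round(r, 3))
```

Moving the age cut (21, 30 or 35 years) changes all four cell counts, which is why a separate coefficient is quoted for each contrast.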

The A.T.S. differed from the Army in that recruits of, say, 
26 and over were often of better quality, drawn from higher-grade 
occupations, than the younger ones, since more of them were 
volunteers. All their correlations with age, therefore, tended to be 
more highly positive or less highly negative than those of men. In 
one representative sample of 200, they ranged from -.100 for 
Matrices and -.070 for Squares to +.147 for Verbal (Test 25), and 
+.180 for Spelling. This -.100 for Matrices is almost as 
certainly too small, as the -.340 (in Table XIV) is too high. Hence 
the correlation of -.238 obtained among 90,000 naval recruits, 
whose occupational make-up was held constant, may well be about 
right (cf. Table XV). 

Differential Decline with Age 

Since age changes differ on different tests it should be pointed 
out that any composite intelligence score, based on several tests of 
varied types (such as T2 or Summed S.G.) will have a somewhat 
different make-up at different ages. A man of 20 and another of 36 
may obtain the same Summed S.G., but the former is likely to 
score better on Matrices and Bennett, the latter on Verbal and 
Arithmetic. 

That the rate of decline in g (as measured by Matrices, not by 
verbal intelligence tests) is greater at lower intelligence levels was 
indicated by a survey of 90,000 naval recruits in 1942 (cf. Vernon, 
1947c). They were classified under twelve broad occupational 
headings, so that it was possible to maintain the same occupational 
distribution at all ages. Table XV shows the names of the groups 
and total numbers, together with the mean scores of 16- to 
19-year-olds (average about 18.0) and the mean scores of 20- to 
40-year-olds (average about 30.0). The last two columns show the decreases 
between the 16-17 and the 18-19 year groups, also between the 
16-19 year groups and those aged 30 and over. In the first of the 
latter columns the decline in the four most intelligent occupations 
is less than a quarter the decline in the three dullest occupations. 
In the last column the results are more irregular, possibly because 
of the small numbers available at later ages, nevertheless, the 
Table XV. Average Matrices Scores and Declines with Age in Twelve 
Occupational Groups 

                                  Mean Scores at       Decreases from 
Occupation               No.      18.0     30.0    16+ to 18+  16+ to 30+ 
Clerks                  11,603      ?      30.1        ?          6.07 
Electrical Workers       4,002      ?      37.3        ?          6.18 
Woodworkers              6,705      ?      36.2        ?          3.40 
Precision Workers       11,087      ?      36.8        ?          6.43 
Sheet Metal Workers      6,401      ?      36.1        ?          3.72 
Retail Tradesmen         ?,373    37.4     34.7      0.77         3.80 
Machine Operators        5,030    36.7     33.4      0.42         6.67 
Builders                   ?      30.4     32.0      0.77         4.43 
Drivers                  6,764    34.6     31.6      0.61         4.66 
Mates                    6,367    30.6     30.6      0.86         8.49 
Farm Workers             2,400    33.4     30.6      1.39         6.17 
Labourers               13,406    33.0     28.6      1.67         6.60 
All                     89,764    37.4     34.2      0.72         6.28 


decline is more than twice as great among labourers and mates as 
among woodworkers and sheet metal workers. The interpretation 
of these figures is dubious both because the 16-17-year-olds consist 
of volunteers for the Navy, whereas the 18-year and later groups 
consist mostly of less intelligent conscripts, also because of 
unknown effects of de-reservation policy. It might be rash to 
conclude that the general decline in g starts as early as 17. There is, 
however, no doubt of the statistical significance of differential 
rates of decline in different occupations, and they appear to fit the 
hypothesis of more rapid decline among men who make least "use 
of their brains." 

The same point is brought out in another way in Table XVI, 
which shows the percentages of successive age groups obtaining 
very high Matrices scores (50-54 and 55-60), together with the 

Table XVI. Percentage Matrices Distributions and Mean Scores of 
Different Age Groups 

                                 Age (years) 
Score     16-17   18-19   20-24   25-29   30-34   35-39    40+ 
55-60      0.33     ?      0.32    0.36    0.16    0.04     ? 
50-54      3.72    3.71    4.02    2.66    1.69    1.29     ? 
35-49     66.85   62.64   69.30   62.00   48.60   42.80   33.13 
0-34      30.10   33.20   36.36   44.11   40.66   66.87   66.18 
Nos.     32,412     ?     6,166   4,060   6,361     ?     3,748 
Means     37.68   37.13   36.43   34.60   33.74     ?     30.30 

numbers and mean scores for all occupations combined. In spite 
of the steady decrease in the mean with age and in the proportion 
of above average scorers (from 70 per cent. at 16-17 to 34 per cent. 
at 40+), the proportion scoring 55 and over remains steady (or 
even rises) till 29 years, and only then starts to drop rapidly. Again 
the proportion scoring 50-54 remains steady (or rises) to 24 years, 
and then drops. While there may be some alternative explanation, 
it certainly looks as if the most intelligent retain their intelligence 
longest. 

Changes with Age from 14 to 18+ 
In an investigation of naval artificer training establishments, 
where boys enter at 14 and are trained as skilled tradesmen till 18, 
S.P. Tests 1-4 and several supplementary tests were given to all 
boys, including 300 of average age 15½ and 250 of average age 18½. 
Although this population is far from representative of the general 
average, being highly selected educationally, it has the advantage 
of remaining stable in composition throughout the four years since 
(unlike the ordinary secondary school) scarcely any boys are 
eliminated once they have started the course. The gains in scores 
from 16 to 18 listed in Table XVII have been converted into a 
percentage scale to make them comparable for all tests. There is a 
considerable 

Table XVII. Percentage Increases or Decreases in Test Scores Between 
14+ and 18+ Among Artificer Apprentices and Among Boys Leaving 
School at 14 

                                  Gains or Losses Among 
Test                             Artificers   14-yr. Leavers 
1 Abstraction                       +4.8           +1.0 
2 Bennett Mechanical               +10.0           +6.7 
3a Arithmetic                       +6.7           -4.2 
3b Mathematics                      +6.6           -9.0 
4 Squares                           +9.9           +?.3 
T2                                 +12.4           +1.9 
97 Memory for Designs               +6.6             - 
117 Mechanical Information         +14.2             - 
Electrical Information             +21.0             - 
110 Mechanical Models               +4.0             - 

improvement throughout, and it is almost as great in the 
purest g test (Test 1) as in mathematics, that is, a school subject 
in which boys receive a fair amount of instruction. As one would 
expect it is greatest in the information tests (117E, M, and 2) on 
account of the training, but it is very large also in Test 4, that is in 
spatial ability which is not directly trained at all. 

During 1946-7 several S.P. tests were applied by one of the 
writers and his students at the University of Glasgow to large 
groups of school children. One sample* included all the available 
14-year-old boys leaving school in January, 1947, totalling some 
1,200. Naturally this was a rather inferior sample, but their scores 
on the T2 tests were compared, not with the standards for naval 
recruits in general, but with the scores of a group of 206 naval 
recruits who left Glasgow schools at the age of 14 in 1938 and who 
entered the Navy in 1942. The percentage increases or decreases 
in this group over the 14-year-old boys also appear in Table XVII and 
are very different from those found among boys of superior 
intelligence whose education was continued. There is only a negligible 
increase in the all-round ability measured by T2 or in the g 
measured by Test 1. On educational tests there is a serious loss. 
On the other hand there is a considerable rise in mechanical and 
spatial ability, although it is unlikely that many of the 18-year 
recruits had received any systematic technical or trade training. 
It would appear then that k:m abilities rise with or without 
schooling from 14 to 18, that v:ed abilities drop and g remains 
fairly constant, unless they are stimulated by secondary schooling. 
But the constancy in all-round ability measured by a battery such 
as T2 is specious, and occurs merely because the gain on the 
practical side roughly balances the loss on the educational side. 

* This testing was part of a survey undertaken by the Social and Economic 
Research Department of Glasgow University. The boys were scattered over 
60 schools, and the writer is grateful to Miss C. McCallum and the staff of the 
Glasgow Corporation Child Guidance Clinics for arranging and carrying out 
the testing and scoring. 

In other post-war enquiries, approximate age norms were 
established for a version of Test 1 (Abstraction), for a spelling test, for 
Army Arithmetic and for the Kohs-Misselbrook blocks test. The 
average adult standards, as obtained in the Forces, were found to 
be reached by children of 13, 12½, 13 and 16½ years respectively. 
That is to say, the average 18-year recruit may actually be lower on 
the first three, and particularly on spelling, than he was at the age 
of 14, but, just as was observed with Test 4, he is better than the 
14-year-old on a performance test which involves k-factor. Further 
analysis of the Abstraction, Spelling and Arithmetic results showed 
that the standards at 14 and 18 are just about the same at the top 
end of the scale. The best 10 per cent. of adult recruits do as well 
as the best 10 per cent. of normal 14-year-olds. But there is a 
progressively greater decline at the lower end. The bottom 10 per 
cent. appear to lose 1½ years on Abstraction, two years on Arithmetic 
and three years on Spelling. At 14 years the 10 per cent. poorest in 
spelling reached a maximum level equivalent to the average 
11-year-old; whereas among recruits the corresponding level is only 8 years. 
It is little wonder that there were so many complaints of 
semi-illiteracy and appalling spelling among recruits when dull 
adolescents lose so rapidly the skills in which they were drilled till 14. 
The better retention among brighter men, who receive further 
schooling or who use their skills in their jobs, confirms our 
conclusion as to the differential decline in abilities. 

We see then that the average performance of a group of 
adolescents or adults on psychological tests varies greatly with the type 
of ability tested, with age, and with any schooling or other training 
the group has received, or forgotten. This complicates 
tremendously the establishment of satisfactory test norms. A further 
corollary, since tests of g appear to be affected in much the same 
way as educational knowledge by use or disuse, is that 
psychologists and educationists should investigate just what types of 
adolescent and adult education, and of occupational and 
avocational experience, most effectively stimulate intelligence. Not only 
can it be raised during the 'teens by schooling, but also the 
inevitable decline in adulthood can probably be retarded. 

Occupational and Geographical Differences 

The striking differences on T2 and other tests between men in 
various Service employments are described in the next Chapter. 
Few large-scale studies of civilian occupational differences were 
carried out, apart from the one whose results are listed in Table 
XV. This was the only one, too, where the age distribution was 
held constant in all groups. It may be observed that the range of 
scores from clerks at the top to labourers at the bottom is rather 
small, corresponding in terms of I.Q. to a range of about 110 to 
90*. Verbal tests such as Army Alpha and Cattell's Scale III give 
a much wider range than does Matrices. This bears out our 
contention (in the next Chapter) that occupational suitability, and, 
therefore, occupational level, depends on v:ed factor as well as 
on g. 

* This assumes a Standard Deviation of 16. Cattell's Scale III has a S.D. of 
26. Had we assumed this figure our range would be perhaps 116 to 84. But 
Cattell's range is still much wider, running from clerks with I.Q. 127 to 
packers and sorters I.Q. 78. 
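
The conversion in the footnote can be checked with a few lines of arithmetic. The sketch assumes, as is conventional but not stated in the text, that the I.Q. scale has a mean of 100; the function name is hypothetical.

```python
def rescale_iq(iq, sd_from, sd_to, mean=100.0):
    """Re-express an I.Q. quoted on one standard deviation on another.

    The score is first turned into a deviation from the mean in S.D. units,
    then expanded or contracted to the new S.D.
    """
    z = (iq - mean) / sd_from
    return mean + z * sd_to


# Range of about 110 to 90 on an S.D. of 16, re-expressed on an S.D. of 26.
high = rescale_iq(110, sd_from=16, sd_to=26)
low = rescale_iq(90, sd_from=16, sd_to=26)
print(round(high), round(low))  # prints: 116 84
```

The result, about 116 to 84, agrees with the range quoted in the footnote and remains well inside Cattell's 127 to 78.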

The only regions of Britain studied were the nine very 
heterogeneous areas into which naval recruiting centres are grouped. 
Small yet significant differences are indicated by the mean scores 
listed in Table XVIII. The first column of means gives the 


Table XVIII. Mean Matrices Scores of Naval Recruits From Different 
Areas 

                                                              Mean Scores 
Area                                                  No.      (i)    (ii) 
London, including Kent, E. Anglia and Northampton    22,376   37.6   37.3 
Manchester, including Yorkshire                      12,626   37.2   38.7 
Derby, including Lincoln and Nottingham               6,496   36.5   36.6 
Southampton, including Sussex, Oxford and Dorset      7,706   36.3   36.8 
Newcastle, including Cumberland, Westmorland 
  and Durham                                          6,264   36.0   36.2 
Liverpool, including W. Lancashire and N. Wales       7,690   36.8   36.0 
Birmingham, including Gloucester, Hereford, 
  Worcester, Shropshire and Mid-Wales                 6,397   36.6   36.6 
Bristol, including Wilts, Somerset, Devon, 
  Cornwall and S. Wales                               7,988   35.0   36.2 
Glasgow, including the whole of Scotland             11,223   34.8   36.3 


obtained figures, corrected only for age differences. But occupation 
and area are closely linked. Glasgow is low partly because it 
contains the largest proportion of labourers, Manchester high because 
it has an excess of clerks and electricians. In the last column, 
therefore, the occupational distribution has been equalised in all 
areas. Manchester then drops a little and Glasgow rises, but the 
alterations are only slight. 

Naturally a bigger range of differences would be expected 
between more homogeneous areas. It is absurd to group, e.g., 
Scottish Highlanders with Glasgow Irish, and to make no 
separation between predominantly urban and rural regions. Nevertheless, 
even these crude figures are of interest in suggesting national 
differences, though considerable caution is necessary since we do 
not know how representative are the samples. Welsh and Irish 
recruits would occur chiefly in Liverpool, Birmingham, Bristol 
and Glasgow areas, and it is noticeable that these are at the 
bottom of the list. 

Another comparison was afforded by the survey of Glasgow 
14-year school leavers, though these groups are not, of course, typical 
of the whole 14-year population. The mean T2 for 400 boys at 
Roman Catholic schools, i.e. mostly of Irish descent, was 56.0, and 
for 771 boys in other schools 64.3. This corresponds to a difference 
in average I.Q. of about 7 points. 

Service Differences 

The belief was widely held that the R.A.F. got most of the 
"cream" of recruits, that the Navy also had more than its share, 
while the Army had to put up chiefly with below-average men. 
This could not be directly proved or disproved since no one test 
was taken by all three Services. However, a fairly trustworthy 
"bridge" was built up as follows. The G.V.K. tests were given to 
a group of 562 naval recruits, half before and half after taking the 
T2 tests. (The two batteries gave an inter-correlation of .792.) 
Allowance could thus be made for practice effects, and 
corresponding percentile levels on the batteries were found. Another 
conversion table between Matrices and T2 was constructed (their 
inter-correlation in an unselected group of naval candidates being 
estimated as .80). Now Matrices had been given to all Army and 
A.T.S. recruits in 1942 as well as in the Navy. Table XIX shows 
the percentage S.G. distribution on Matrices for some 100,000 
Army recruits entering during July-November, 1942, for an 
equally large group of accepted naval candidates during 
January-September, 1942, for 3,769 ordinary seamen within the same 
period, and for A.T.S. intakes during the whole of the same 
year*. It will be seen that the intakes are closely similar in numbers 


Table XIX. Matrices Distributions During 1942 in the Army, Navy 
and A.T.S. 

                               Naval 
S.G.   Score      Army      Acceptances    Seamen      A.T.S. 
                per cent.    per cent.    per cent.   per cent. 
1       46+       12.3         14.6         11.9        14.3 
2       40+       23.6         27.9         30.6        27.6 
3       29+       42.5         44.1         40.8        39.3 
4       20+       16.2         12.3         14.0        17.6 
5       19-        6.4          1.2          2.7         ?.3 


* These are Army norms; unfortunately naval and A.T.S. norms differed by 
1 or S points. Since S.G.s alone were recorded at Recruiting Centres, not scores, 
only the Army and the O.S. figures are the percentages actually observed. The 
naval and A.T.S. figures have been converted to Army norms, and are unlikely 
to be inaccurate by more than 0.6 per cent. 
of S.G.1s, but that it is true that the Army received a much larger 
proportion of S.G.5s or very dull recruits. In terms of scores the 
mean difference between Army and Navy amounted at most to 
2 points on Matrices at this period, or 6 points on T2. 
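
The "bridge" between batteries rests on finding corresponding percentile levels: a score on one battery is matched to the score on the other that cuts off the same proportion of the same group. A minimal sketch of such a conversion is given below. The score lists are invented, the function name is hypothetical, and the allowance for practice effects mentioned in the text is left out.

```python
from bisect import bisect_right


def equipercentile(score, from_scores, to_scores):
    """Convert a score on one battery to the score on another battery
    that stands at the same percentile level in the same group."""
    ordered_from = sorted(from_scores)
    ordered_to = sorted(to_scores)
    # Number of recruits scoring at or below the given score.
    at_or_below = bisect_right(ordered_from, score)
    # Matching rank on the second battery (ceiling division on integers).
    rank = -(-at_or_below * len(ordered_to) // len(ordered_from))
    index = min(max(rank - 1, 0), len(ordered_to) - 1)
    return ordered_to[index]


# Invented scores for one group of 40 recruits on two batteries.
battery_a = list(range(20, 60))
battery_b = [2 * s + 10 for s in battery_a]

print(equipercentile(39, battery_a, battery_b))  # prints: 88
```

Here a score of 39 on the first battery stands at the median of the group, and is therefore converted to 88, the median score on the second.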

We can probably allow that the Army intakes were slightly 
inferior to the population as a whole, and that naval and A.T.S. 
ones were slightly superior. Naval norms were, however, mostly 
based on ordinary seamen, not on recruits in general. This meant 
that some of the brightest recruits, entered as radio mechanics, 
telegraphists, etc., and some of the dullest entered as stokers and 
cooks, were omitted. Table XIX shows that seamen were of slightly 
lower quality than naval acceptances, though better than Army 
recruits, and it is a reasonable guess that, in average score at least, 
they were very close to the norm for the population as a whole. 
Probably, however, they were more restricted in range than the 
general population, both because very low Matrices scorers were 
rejected at recruiting centres, and because of the omissions just 
mentioned. 

Table XX gives the best estimates that could be made of T2 
percentiles in R.A.F., Navy and Army. 

Table XX. T2 Levels in R.A.F., Navy and Army 

                                                 Percentiles 
Service               Date     No.      90th    Median    10th 
R.A.F. Air Crew       1942    1,141      141     114       86 
R.A.F. Ground Staff   1942    6,000      116      81       64 
Navy O.S.             1942    1,000      108      76       46 
Navy O.S.             1944    3,384      132     100       68 
Army recruits         1942      678      100      72       36 
Army recruits         1946    1,000      112      81       48 

Being based on moderate-sized groups they are less representative 
than the figures in the previous Table, and the 90th and 10th 
percentiles are naturally less reliable than the 50th. The table bears 
out our conclusion that in 1942 Army recruits were only slightly 
poorer than naval ones, and it may be seen that R.A.F. ground staff 
were but little superior to seamen. By 1944-6, however, the Navy 
was able to reject many more candidates at recruiting centres, 
whereas the Army could only raise its standards to a small extent. 
These figures remain fairly representative of peace-time intakes. 
R.A.F. air crew, in contrast with ground staffs, were clearly very 
superior, and provided some substance for the Army's complaint that 
the R.A.F. and Navy got the largest share of high-grade material. 

Effects of Pre-Service Training 

Great stress was laid by the Services on the value of pre-Service 
training in the Sea Cadet Corps, Junior or Senior Training Corps 
or Army Cadet Force and the Air Training Corps. Several 
follow-up studies in the Royal Navy showed that such recruits, 
together with members of Scouts and Boys' Brigade (who 
constituted a quarter to a third of all intakes), were superior as seamen, 
in Coastal Forces, as radio and air mechanics and air fitters, and as 
telegraphists; but it was doubtful how far this was due to their 
possessing better g and education. Some 4,600 recruits in H.M.S. 
Royal Arthur were studied in 1944, including nearly a thousand 
Sea Cadets and a thousand A.T.C. members, and smaller groups 
from other organisations. The following rank order of average 
ability was obtained on T2 and on Test 3b (Mathematics) and 
educational level: 

J.T.C. and S.T.C. 

Scouts and Sea Scouts. 

A.T.C. 

Members of no organisation. 

Sea Cadets. 

Boys’ Brigade and Church Lads' Brigade. 

Army Cadets. 

It should be remembered that the best A.T.C. members were 
likely to join the R.A.F. and the best Army Cadets the Army, also 
that the general quality of entries was high at this time. In 1942-3 
Sea Cadets would certainly have been superior to average intakes. 
Tabulations were made of the number of recruits in each group 
recommended by P.S.O.s as suitable for the main categories: 
officer candidates, mechanics, writers and coders, communications, 
seamen and stokers, cooks and stewards, etc. It was found that 
Sea Cadets provided rather more trainees for the most active 
branches, namely officer candidates and seamen, though fewer 
for the specialist branches than did non-members, even when 
allowance was made for lower intelligence and education. The 
A.T.C., when ability was held constant, provided an excess only 
in the communications branches. Ex-Scouts yielded at least as 
large a proportion of officer candidates and specialists as any of the 
pre-Service organisations. Length of membership of the 
organisations was found to have a favourable, though only a very slight, 
effect on the yield of high-grade material. The conclusion reached 
was that pre-Service organisation membership as such had scarcely 
any effect, except for morse and mathematical training given in the 
A.T.C. The good showing of members is almost entirely 
attributable to the intellectual, educational and other qualities of those who 
join the organisations. 

Similar results were obtained in the Army when some 3,600 
members of pre-Service organisations, and 6,600 non-members 
were studied. On test results, education, and ratings by P.S.O.s for 
officer quality and leadership, the rank order of organisations was 
Junior Training Corps, A.T.C., Army Cadet Force, non-members, 
and Sea Cadets. The latter now come at the bottom since the best 
cadets naturally prefer the Navy. While 12½ per cent. of members
and only 2½ per cent. of non-members were earmarked as potential
officers, the difference was almost wholly attributable to the higher
education and intelligence of the former, as shown by Table XXI.


Table XXI. Analysis of Variance in Potential Officer
Recommendations Attributable to Several Factors

                                                        Sum of squares
                                                        per cent.
Differences in educational standard                     10.88
Additional variance due to membership of a
  pre-Service organisation                               0.96
Additional variance between different organisations      3.14
Individual differences (residual)                       76.03
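
The first line of Table XXI can be read as the share of the total sum of squares that lies between educational groups. The following is only a minimal sketch of that single calculation, not the authors' own procedure: the function name, the data and the group labels are hypothetical, and the later lines of the Table (the additional variance due to membership and to differences between organisations) would need a further, sequential analysis that is not attempted here.

```python
from collections import defaultdict


def percent_between_groups(outcomes, groups):
    """Between-group sum of squares as a percentage of the total sum of squares."""
    n = len(outcomes)
    grand_mean = sum(outcomes) / n
    total_ss = sum((y - grand_mean) ** 2 for y in outcomes)

    # Collect the outcomes belonging to each group label.
    by_group = defaultdict(list)
    for y, g in zip(outcomes, groups):
        by_group[g].append(y)

    # Each group contributes (size) x (group mean - grand mean) squared.
    between_ss = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
        for ys in by_group.values()
    )
    return 100.0 * between_ss / total_ss


# Hypothetical data: 1 = earmarked as potential officer, 0 = not.
recommended = [1, 1, 0, 0, 0, 0, 0, 0]
education = ["higher"] * 4 + ["elementary"] * 4
share = percent_between_groups(recommended, education)
print(round(share, 2))
```

With a two-valued outcome such as this, most of the variance is bound to remain in the residual line, which is one reason why the residual in the Table is so large.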


Follow-up to later stages in the naval or military careers of these 
recruits would be desirable, but was not found practicable. In a 
smaller, but carefully controlled, experiment, fifty-five Army 
Cadets who had obtained the War Certificate A for proficiency 
were matched with non-members of equivalent education and 
intelligence, and were subjected to special training. Of the former 
group 83 per cent., of the latter 60 per cent., were able to accom¬ 
plish their primary Army training in four weeks instead of the 
usual six, and were superior also in Corps training which was 
reduced from ten to six weeks. The Certificate recruits were better
in rifle shooting, but showed little difference on the light machine-gun
(i.e. a comparatively unfamiliar weapon) or on physical training.
Three months later over half the men had been before W.O.S.B.s,
and the superiority of the Certificate holders was much less marked,
as shown by the correlations in Table XXII. On the other hand,
the weighted selection test scores used for picking potential officers
(the O.R.1 Index, cf. p. 46) correlated almost as well with
W.O.S.B. as with primary training results.


Table XXII. Relative Predictive Value of Pre-Service Training and of
Selection Tests

                                             Correlations with
                                             Primary     W.O.S.B. Pass
                                             Training    or Fail
Possession of Certificate A v. no
  pre-Service experience                     +.47        +.24
Selection test battery (O.R.1)               +.38        +.33
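
Where both variables are two-valued, as with possession of Certificate A and W.O.S.B. pass or fail, the product-moment correlation reduces to the phi coefficient of a fourfold table. The text does not say which coefficient was actually computed for Table XXII, so the sketch below rests on that assumption, and the counts in it are invented for illustration.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a fourfold table.

    a, b = passes and failures in the first group (e.g. Certificate holders)
    c, d = passes and failures in the second group (e.g. matched non-members)
    """
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return (a * d - b * c) / denominator


# Hypothetical counts: 30 of 40 holders pass, 20 of 40 non-members pass.
phi = phi_coefficient(30, 10, 20, 20)
print(round(phi, 3))
```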


Effects of Physical Condition 

Some 260 Army recruits with very poor physique who took a
two months’ course at Army physical development centres were
tested before and after. (Results on the Matrices test alone were
available for 648 cases.) The gain in scores to be expected on
re-testing was known from an earlier investigation into the
reliability of S.P. tests. In every test there was a greater gain than mere
practice effect, though it was often small and statistically
non-significant. The results are expressed on a common percentage
scale in Table XXIII.


Table XXIII. Percentage Gain in Test Scores Attributable to
Attendance at P.D.C.s

Test             Per cent. gain above    Probability of gain
                 practice effect         being due to chance
16 Agility       6.5                     <.001
Matrices         2.8
26 Verbal        2.4
8 Assembly       2.2                     <.05 >.01
3 Arithmetic     0.6                     >.05
2 Mechanical     0.6                     >.05


The increase in Agility would be expected
and conforms with an improvement in medical category, height
and weight. Attached to the centres are picked educational
sergeants who organise numerous “brains trust” periods, spelling
bees, and the like. This may account for the rise on the Verbal test.
The increase on Matrices is less readily explicable, but suggests
that the test involves some kind of mental alertness linked to
physical alertness, in addition to g. Note, however, that, although
highly significant, it only amounts on the average to 1½ points of
score.
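
The reasoning behind Table XXIII is that the observed gain on re-testing must first be reduced by the gain expected from practice alone. The sketch below illustrates that subtraction. The text does not define its "common percentage scale", so expressing the excess as a percentage of the maximum possible score is purely an assumption made here, and all the figures are hypothetical.

```python
def percent_gain_above_practice(mean_before, mean_after, practice_gain, max_score):
    """Gain beyond the expected practice effect, as a percentage of max_score.

    practice_gain is the rise expected on re-testing with no intervening course.
    Dividing by max_score is an assumed convention, not one stated in the text.
    """
    excess = (mean_after - mean_before) - practice_gain
    return 100.0 * excess / max_score


# Hypothetical test of 60 items: means of 30 and 33, with 1.5 points of practice effect.
gain = percent_gain_above_practice(30.0, 33.0, 1.5, 60)
print(gain)
```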

Several enquiries were made in the A.T.S. into the effects of 
menstruation. The day within the menstrual cycle on which a 
battery of eight selection tests was taken for a second time was 
ascertained by a medical officer from 1,335 auxiliaries, all of whom 
claimed a normal or twenty-eight-day cycle (some 270 others
admitting short, long or variable cycles were excluded). They were
classified into four “phase” groups: 

(i) From four days before to four days after the onset of 
menstruation. 

(ii) From fifth to tenth day. 

(iii) Ovulation phase—eleventh to eighteenth day. 

(iv) From nineteenth to twenty-fourth day.

At the original testing the menstrual days were unknown and could
be assumed to be randomly distributed. Comparisons of test and
re-test Matrices scores yielded the analysis of variance shown in
Table XXIV. There appeared to be slight differences between
observed and expected re-test scores on some days, but they were
irregular and were not associated with any particular menstrual
phase. Phase as such had no demonstrable effect. A similar analysis
of the Clerical test (S.P.12) gave negative results. Although very
slightly lower scores were obtained on most of the eight tests by
the period group (Phase i), in no case was the difference significant,
even with this very large number of cases.


Table XXIV. Analysis of Variance Due to Menstrual Phase

                                                  Sum of Squares    P.
                                                  per cent.
Total Variance on Re-testing                      100
Accounted for by regression of re-test on test     62.328           -
Accounted for by difference between phases          0.060           >.05
Accounted for by differences between days
  in the same phase                                 1.066           <.05 >.01
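
The four phase groups listed above divide a twenty-eight-day cycle completely, provided that "four days before to four days after the onset" is read as days 25 to 28 together with days 1 to 4. That reading is an assumption, made because the fifth day is expressly assigned to group (ii). A short sketch of the classification follows; the function name is hypothetical.

```python
def menstrual_phase(day, cycle_length=28):
    """Assign a day of the cycle (1 = onset of menstruation) to phase i-iv."""
    if not 1 <= day <= cycle_length:
        raise ValueError("day must lie between 1 and cycle_length")
    # Phase (i): the last four days of one cycle and the first four of the next.
    if day <= 4 or day > cycle_length - 4:
        return "i"
    if day <= 10:
        return "ii"
    if day <= 18:
        return "iii"   # ovulation phase
    return "iv"


# Count how many days of a 28-day cycle fall in each phase.
counts = {}
for d in range(1, 29):
    phase = menstrual_phase(d)
    counts[phase] = counts.get(phase, 0) + 1
print(counts)
```

On this reading the phases are unequal in length (eight, six, eight and six days), so that with randomly distributed test days unequal numbers in the four groups are to be expected.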

In a further investigation, 1,000 auxiliaries were asked to state 
both at test and re-test if they felt unable to do themselves justice, 
and if so on what grounds. The most common complaints were 
colds 10.4 per cent., menstruation 3.46 per cent., and chilblains
2.1 per cent. In no case did the scores of those claiming unfitness
on either occasion show significant drops. Indeed, the most consistent
difference was a rise in the practical Mec test scores (S.P. 24)
among those undergoing menstruation.

Among some 3,000 auxiliaries enquiries were made at a medical 
interview and 24.4 per cent. reported some pain during their
periods. A strong association was discovered between this index
of pain and type of A.T.S. or civilian employment. It was significantly
higher among women in strenuous or outdoor, than in
sedentary or indoor, jobs. No significant relationship to educational
standard was observed, and though pain occurred slightly
more often among women of below average intelligence on
Matrices, this was probably due to the association both of pain and
of low intelligence with low medical category.



CHAPTER XII

GENERAL SURVEY OF THE VALUE OF VOCATIONAL TESTS

Abstract. An outline of the main results of pre-war investigations
indicates that tests of general intelligence, of mechanical, spatial
and other abilities, are of considerable value in assessing occupational
suitability. Mechanical-spatial tests are certainly superior to
verbal and educational ones in the selection of adolescents for
engineering training, though giving less consistent results with
adults. Paper-and-pencil group tests of the miniature situation
type and diagnostic tests show promise in a variety of educational
and vocational fields. Though many psychologists consider that
such general tests are less appropriate than specialised practical
tests for particular jobs, the evidence is inconclusive, and the choice
depends largely on whether the tests are to be used primarily for
classification and guidance or for selection.

Very extensive follow-up investigations were carried out in the
Services. These demonstrated first the value of a standard battery
of group tests in a large variety of jobs, and secondly, the rather
small extent to which more specialised tests helped in differentiating
between different jobs. Mathematical and verbal tests tended
to surpass mechanical and spatial tests even in mechanical and
practical occupations. This tendency persisted when operational,
as contrasted with training, criteria were studied, but would not
necessarily be true under industrial (as distinct from Service)
conditions.


So great is the volume of investigations into the value of different 
types of tests for predicting educational or vocational success, that 
we can attempt to give here only a very general summary of the 
literature before turning to the main results achieved in the Forces. 

The usefulness of group intelligence tests of the ordinary verbal
type in relation to school and university work is well established*,
but it is generally considered that they have little bearing on vocations
other than those which involve manipulation of words and
numbers. Thus they give moderate correlations with proficiency
among clerical workers, but have been shown in various studies to
be unrelated to success among business executives, telegraphists,
compositors, motormen, toolmakers, certain types of assembly
workers, packers, etc. In jobs of a routine mechanical kind high
intelligence and education may sometimes even be disadvantageous,
leading to dissatisfaction and increased labour turnover. Not
all these studies will bear close scrutiny, however; the tests used
were sometimes inappropriate, the criteria far from reliable, and
often no attention was paid to the selectivity of the groups concerned.
Other investigations, such as those with Army Alpha and
Cattell’s Scale III (1934), have shown very considerable differentiation
in average intelligence between high- and low-grade
occupations. Though there is much overlapping, for example, some
machine operators being more intelligent than some school-teachers,
yet it must follow that intelligence has a bearing on
success as a teacher. But as such groups as teachers are usually
highly selected in respect of some factor like education, which is
itself highly correlated with intelligence, any correlations between
teaching ability and intelligence are thereby greatly reduced.

* A useful account of investigations at the University level is given by
Eysenck (1947b).
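
The reduction of a correlation in a selected group can be illustrated with the standard formula for direct selection on the test itself. This is a simplification adopted for the sketch only: the teachers in the example above are selected on education rather than on the intelligence test, which is a case of indirect selection, and the formula also assumes a linear relation with equal scatter throughout the range. The function name and the figures are hypothetical.

```python
import math


def restricted_correlation(r_full, sd_ratio):
    """Correlation expected in a selected group under direct selection.

    r_full   : correlation in the unselected population
    sd_ratio : test standard deviation in the selected group divided by that
               in the unselected population (a value between 0 and 1)
    """
    if not 0 < sd_ratio <= 1:
        raise ValueError("sd_ratio must be greater than 0 and at most 1")
    return r_full * sd_ratio / math.sqrt(1 - r_full ** 2 + (r_full * sd_ratio) ** 2)


# Hypothetical: a correlation of .5 in the whole population, observed in a
# group whose spread of test scores is only half that of the population.
r_selected = restricted_correlation(0.5, 0.5)
print(round(r_selected, 3))
```

On these assumed figures a population correlation of .5 would appear as rather less than .3 within the selected group.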

Actually there is quite a lot of evidence of positive correlation 
between intelligence as measured by suitable tests, or educational 
standard shown by college grades, and success in professional and 
administrative jobs. Jenkins (1947) summarises studies of “leaders” 
in industry and other fields. The American Army General Classifi¬ 
cation Test gave moderate correlations with the achievement of 
commissions by officer candidates, though none at all with assess¬ 
ments of efficiency in battle. Barr (1946) and his collaborators 
found that when teaching efficiency is objectively measured by the 
progress made by the pupils, intelligence is one of the most pre¬ 
dictive factors. In this country Heim (1947) has obtained promising 
results with her high-level A.H.6 test among industrial executives 
and B.B.C. engineers. Farmer and Chambers (1936) applied Group
Test 33 (the N.I.I.P. verbal intelligence test), the Cube Construction
performance test, and various mechanical and hand-eye co-ordination
tests to numerous groups of engineering workers. They
obtained correlations up to about .4 between the intelligence tests
and proficiency in the more highly skilled, though not in the
simpler, jobs. In Holliday’s (1943) follow-up of engineering
apprentices, Group Test 33 correlated as highly as his spatial and
mechanical tests with instructors’ assessments, though less well
among trade apprentices. He suggests, however, that boys who
do better on intelligence than on mechanical tests impress their
assessors as being “bright,” but do not make as good craftsmen.
Both he and Farmer provide some evidence that tests are more
predictive of proficiency after several years than they are of success
in the early stages of training.

Nothing appears to have been published on the vocational value 
of non-verbal g tests. They are certainly less useful than verbal 
tests in relation to primary school work, but seem to be superior to 
them in predicting mathematical and scientific achievement at 
advanced secondary and university levels. Performance tests such 
as Minnesota, Kent-Shakow, Moorrees and Oakley Formboards,
Cube Construction and O’Connor’s Wiggly Blocks test are widely
used in industry and in vocational guidance. There is, however,
remarkably little evidence of their validity. In Rodger’s (1937)
study of several trades taught to Borstal youths, Cube Construction
was the best test. Psychologists have largely employed such tests
for the qualitative indications they yield of methods of work, but 
these, too, are in need of validation. 

Mechanical assembly tests such as Stenquist’s appear to have 
been used only on a small scale because of the time required for 
individual testing. No follow-up evidence is given on the 14,000 
American Army recruits to whom an assembly test was applied in 
1917-18. Fair results among boys and apprentices were found by
the Minnesota investigators (Paterson et al., 1930), Rodger (1937),
Earle and Macrae (1929), Farmer and Chambers (1936) and others,
and high validities have been claimed among certain types of metal 
workers and cotton-mill machine fixers. The Purdue Mechanical 
Assembly Test (Tiffin, 1946) is a new and improved version which 
has been validated among machinists. Group tests based on deduc¬ 
tions about working mechanical models have been devised both by 
Cox and Vincent. These appear to be effective substitutes for 
assembly tests among adolescents, according to Hunt and Smith 
(1946), Holliday (1943), and Shuttleworth (1942), but they, too, 
have seldom been tried out on adult workers.

Group tests of spatial ability or k which have been applied vocationally
include the N.I.I.P. Form Relations and Memory for
Designs Tests, the Minnesota Paper Formboard, Thurstone’s
space tests, Group Test 80, Squares and Figure Construction (cf.
pp. 230-230). Many of the investigators just cited provide evidence
of their usefulness in selection for technical education and trade
apprenticeship. Certainly, along with models or assembly tests,
they are superior to scholastic examinations, and probably to
verbal intelligence tests, in picking boys for “practical” careers,
though how far they work at as early an age as 11-12 is more doubtful.
Holzinger and Swineford (1940) similarly report that they
correlate with shop work and mechanical drawing, but not with 
geometry. 

Several mechanical tests are based on pictures of machines, tools 
or mechanical situations. Cox’s (1928) Diagrams, Explanations and 
Completion tests set out to measure comprehension or mechanical
understanding rather than experience, while O’Rourke’s so-called
Mechanical Aptitude test (which also includes verbal items) is
clearly a test of mechanical information. Stenquist’s test with the
same name and Bennett’s Mechanical Comprehension test are 
intermediate, the latter also containing problems from statics and 
dynamics, heat, light and sound. There is considerable confusion 
as to what these tests measure, and their titles are often misleading. 
However, as pointed out above (p. 170), the distinction between 
aptitude and attainment is of little practical importance. Cox’s 
tests are considered useful by the Birmingham investigators; 
O’Rourke’s was widely used by the Tennessee Valley Authority, 
and the Bennett test was successful among machine-tool operators.
But there is little further evidence of their validity from civilian
investigations. Cunningham (1943), describing the war-time applications
of psychological methods in Australia, mentions a study
of twenty tests for selecting fitter and turner trainees in munitions 
industries. The most valuable were intelligence, technical infor¬ 
mation, and mathematics tests, Cox’s Diagrams, a paper formboard 
and other k tests. 

All the above tests are general ones: tests of general intelligence,
spatial, mechanical or practical abilities. Most psychologists
assume that in selection for particular jobs, tests must be more
specifically designed on the basis of a careful job analysis. Drake
(1940), for instance, as a result of extensive experience in a large
industrial firm, regards the application of paper-and-pencil tests
as a waste of time and money. He himself developed dexterity,
co-ordination and other tests directly related to groups of similar
jobs. On the other hand Pond (1941), in the course of many years’
work in a metal manufacturing company employing 6,000 people,
appeared to obtain good results with a verbal intelligence test, 
paper formboard, the MacQuarrie paper-and-pencil mechanical 
ability test, O’Connor Wiggly Blocks and Kent-Shakow formboard. 
Similarly the personnel department at Rowntrees uses a limited 
battery for all entrants. 

Descriptions of typical analytic, analogous and work-sample tests 
are given in any book on industrial psychology. Surveys of recent 
work on selection tests have been published by Long and Lawshe 
(1947) and Hardtke (1946), the latter listing over a hundred references
on metal working occupations alone. It is very difficult to
evaluate all this work, and likewise the even more ambitious applications
of vocational testing in pre-war Germany. The authors just
mentioned admit that a large proportion of the publications are
merely descriptive, containing no convincing validatory evidence,
and that their primary object seems to be to “sell” tests to industrialists.
In most instances, too, the samples of workers studied are
small, the criteria ill-defined and poor in reliability. In the present
writers’ opinion the results of no single experiment, even on a
hundred cases, should be accepted at their face value. To be acceptable,
a battery of selection tests should be used over a considerable
period and followed up in such a manner as to prove its worth at 
least twice. For when a number of tests are tried out on a smallish 
group, some of the coefficients are almost sure to appear promising; 
yet when repeated on a larger scale, possibly under slightly 
changed conditions of work or with different types of workers, the 
results may alter entirely. We certainly do not wish to decry all 
the work done on specialised tests, even if much of it has little 
permanent value. The decision between such tests on the one
hand and reliance on general tests + interview on the other hand,
obviously depends chiefly on whether the investigator is more
interested in selection or in classification and guidance. But we can
say that the evidence regarding the practicability and validity of
specialised tests is hardly convincing enough to dissuade the
psychologist from making the best possible use of the simpler techniques,
and resorting to the more complex only in so far as they
can be proved to add appreciably to the accuracy of his predictions.
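
The warning above, that some coefficients are almost sure to appear promising when many tests are tried on one small group, can be given a rough numerical form. The sketch assumes that each test has no real validity and that the tests are independent of one another; the second assumption is certainly too strong for tests in a battery, so the figures indicate only the order of magnitude. The function name is hypothetical.

```python
def chance_of_spurious_hit(n_tests, alpha=0.05):
    """Probability that at least one of n_tests valueless tests reaches the
    alpha level of significance by chance, assuming the tests are independent."""
    if n_tests < 0 or not 0 <= alpha <= 1:
        raise ValueError("n_tests must be non-negative and alpha between 0 and 1")
    return 1 - (1 - alpha) ** n_tests


# With twenty valueless tests tried on a single group:
p20 = chance_of_spurious_hit(20)
print(round(p20, 3))
```

On these assumptions a trial of twenty tests is more likely than not to yield at least one apparently significant coefficient, which is the case for requiring a battery to prove its worth at least twice.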

The bulk of vocational testing has been in manual occupations, 
and no clear trends are discernible in other fields. In motor driving,
in telegraphy and in music the most successful tests appear to be 
of the work-sample or miniature situation type. Analytic tests such 
as Seashore’s tests of musical talent have been used, but generally 
show poor validities. According to Johnson (1946), a simple combination
of relevant biographical items is as effective in picking
motor drivers prone to accidents as are any of the proposed 
apparatus tests of co-ordination, reaction time and the like. Useful 
tests of visual perception for range- and height-finders were 
developed by American psychologists during the war, but the 
application of tests of dark adaptation in eliminating men who 
could not see effectively at night appears to have been a complete 
failure. 

Tests of medical, scientific, legal and nursing aptitudes have
usually been validated against examinations in these subjects, not
against skill on the job. Thus both in these fields, and in the selection
of clerical workers, paper-and-pencil tests either of general or
of more specialised abilities are found quite useful. For example,
the efficiency of administrative civil servants was successfully predicted
by a battery consisting of tests of general intelligence,
knowledge of current events, knowledge of the Civil Service, interpretation
of charts and tables, and judgment of administrative
situations (Mandell and Adkins, 1946). A promising line has been
opened up by recent developments in diagnostic testing. We know
that different types of mental patient can to some extent be distinguished
by projection, sorting, abstraction and other tests; and
Thurstone’s (1944) factorial analysis of perceptual tests has
revealed “qualities of mind” over and above g, v, and k, which
appear to discriminate such groups as administrators and student
leaders. Munroe (1945) has developed a reasonably objective
method of scoring the Rorschach inkblots* to yield a measure of
emotional instability which correlates well with failure at a university
among students of good ability. The same test is claimed to
differentiate mechanics from youths more suited to other types of
job (Piotrowski et al., 1944). Thus, there is reason to hope that the
future will see the isolation and measurement of some of the special
qualities possessed by successful officers and leaders of men, by
executives and administrators, by teachers, research workers,
salesmen, interviewers, and so forth.

* Here the Rorschach test was given to groups, but as nearly as possible in
the original form. The further modification where multiple-choice responses
are provided, so that scoring is wholly objective, is much less successful (but
cf. pp. 266).

At the moment, however, while many tests have been tried out 
in these occupations, there are none that can be confidently recom¬ 
mended. Indeed, it is fair to conclude that successful guidance or 
selection of such persons has been based chiefly on the interview 
and background questionnaire, supplemented by tests of general
intelligence, and personality and interest inventories. 


Follow-up of Tests in the Forces

During 1942-6 there were seventy-six follow-up investigations
in the Navy alone, covering over 31,000 recruits, in the course of
which some criterion of proficiency was correlated usually with six
or more selection tests, often with other data such as source or
mode of entry, age, education and civilian occupation; sometimes
also with numerous items such as interests or leadership experience
taken from the recruits’ biographical questionnaires. The occupations
included seamen, gunnery and torpedo rates, radar and asdic
operators, stokers, stewards, cooks, photographers, safety equipment
ratings, cinema projectionists, supply assistants, writers,
R.N.V.R. officer cadets, artificer apprentices, electrical artificers,
electrical and radio mechanics, motor, engine-room and ordnance
mechanics, wiremen, air fitters and mechanics, telegraphists, signalmen,
telegraphist air gunners, Fleet Air Arm pilots and observers,
naval instructors and W.R.N.S. personnel selection staff. The
groups ranged in size from about 30 to 3,000, but the median size
was 300. In all these studies the tests were given, or other data
collected, on entry and the recruits’ success or failure traced later,
usually at the end of training. They do not include validatory trials
of new tests.


Soon after the introduction of regular selection procedure (the 
General Service scheme) into the Army, 2,600 recruits in some 
twenty representative jobs were followed up, and the test battery
was modified in the light of the findings. Numerous subsequent
investigations were made of jobs where specific information was 
needed, e.g. in order to set appropriate test standards. In the main 
A.T.S. follow-up, some 6,000 auxiliaries were studied in the 
twenty-seven commonest jobs, the median size of sample being 
108. Validation results in the R.A.F. are given in Chapter XVI.

The outstanding facts revealed by this work were the value of a
general all-round measure such as T2 or Summed S.G., and the
rather small differentiation between different types of job by the
more specialised tests. The first point is well illustrated by Figure 4
which indicates the range of T2 among men employed in thirty-six
representative naval branches*. The bars show the 90th and
10th percentile scores and the middle lines the medians. Thus the
top 10 per cent. of R.N. engineer officers score 168 or over, the
bottom 10 per cent. score 126 or under, and the middle man in the
group obtained 149. It will be seen that there is excellent differentiation
between the more and the less highly skilled branches.
For example, less than 10 per cent. of R.N. executive officers and
of ordinary seamen overlap, and the same is true of electrical
mechanics and stokers, or of writers and stewards.
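
The bars described above can be computed for any branch from its list of scores. The sketch below uses the `statistics` module of the Python standard library; the scores are invented, the function names are hypothetical, and the method of interpolation used for the original Figure is not known.

```python
import statistics


def percentile_bar(scores):
    """Return (10th percentile, median, 90th percentile) for one branch."""
    # quantiles(..., n=10) gives the nine cut points dividing the scores into tenths.
    deciles = statistics.quantiles(scores, n=10)
    return deciles[0], statistics.median(scores), deciles[8]


def bars_overlap(bar_a, bar_b):
    """True if the 10th-to-90th percentile ranges of two branches overlap."""
    low_a, _, high_a = bar_a
    low_b, _, high_b = bar_b
    return low_a <= high_b and low_b <= high_a


# Hypothetical branches with scores 1..99 and 101..199.
bar_a = percentile_bar(list(range(1, 100)))
bar_b = percentile_bar(list(range(101, 200)))
print(bar_a, bar_b, bars_overlap(bar_a, bar_b))
```

Two branches whose bars fail to overlap in this sense correspond to the pairs mentioned in the text, such as executive officers and ordinary seamen.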

The results for the main S.P. tests are summarised in the follow¬ 
ing Tables. Tables XXV and XXVI show the range as well as the 
median coefficients in all comparable naval and A.T.S. studies. 
It may be seen that the multiple correlations for the best weighted
battery of tests average .47 in both Services. All the other columns


Table XXV. Raw Validity Coefficients of Standard Naval Selection
Tests

Test       Matrix   Shipley   [?]    [?]    Squares   T2     Multiple r
                                                             (uncorr.)
90%ile     .45      .40       .44    .67    .38       .67    .70
50%ile     .28      .30       .28    .36    .22       .40    .47
10%ile     .10      .11       .13    .17    .06       .20    .32


Table XXVI. Validity Coefficients of Standard A.T.S. Selection Tests
Corrected for Multivariate Selectivity

Test       Matrix   Bennett   [?]    [?]    Clerical   Multiple r
                                                       Uncorr.   Corr.
90%ile     [?]      .41       [?]    .61    [?]        [?]       .84
50%ile     .49      .30       .61    .40    .66        .47       .66
10%ile     .27      .19       .26    .20    .36        .36       .60


* Almost all of these figures were obtained from groups of recruits who
passed their training in 1943. Standards have often changed since then and are
very different in the peace-time Navy.

in the A.T.S. Table have been corrected for selectivity, that is, they
show the correlations to be expected if unselected groups of
auxiliaries had been sent forward for training. Thus, the median
multiple correlation of .47 rises to .66 when this selectivity is
allowed for (in so far as statistical techniques are capable of making
such allowance; cf. Chapter VII). The size of these average or
median coefficients is distinctly greater than would have been
anticipated from civilian experience, though obviously still too low
for accurate selection in the absence of other information. But the
Tables show that the correlations varied considerably in different
investigations. It was noted that the highest coefficients were
usually obtained in jobs involving lengthy training, including a fair
amount of theoretical work, where the final assessments of proficiency
were based on thorough examinations, and where no
scheme of selection run by psychologists was already in operation*.
The lowest coefficients occurred in jobs where the work is highly
specialised (such as radio mechanic), or where previous trade
experience is of paramount importance, also in jobs (such as seaman)
where assessments of efficiency are based more on personality
qualities, e.g. dependability, than on any definite skill or knowledge.
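
The multiple correlation of a weighted battery can be shown in its simplest form, with two tests only, where it has a closed expression in terms of the two validities and the correlation between the tests. This is an illustration of the principle rather than the computation behind the Tables, which involved larger batteries; the function name and the coefficients are hypothetical.

```python
import math


def multiple_r_two_tests(r1, r2, r12):
    """Multiple correlation of a criterion with the best-weighted sum of two tests.

    r1, r2 : validities of the two tests against the criterion
    r12    : correlation between the two tests
    """
    if abs(r12) >= 1:
        raise ValueError("the two tests must not be perfectly correlated")
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical validities of .40 and .30 for two tests that correlate .50.
battery_r = multiple_r_two_tests(0.40, 0.30, 0.50)
print(round(battery_r, 3))
```

With these assumed figures the second test raises the validity only from .40 to about .42, because the two tests overlap considerably in what they measure. This is in keeping with the small gains reported from adding more specialised tests to a general measure.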

Validities in Different Types of Jobs 

We would naturally have expected the verbal and educational
tests to show relatively low validities in mechanical and practical 
occupations, and the mechanical-spatial tests to be of value only 
among mechanics. But such differentiation was conspicuously 
small. In some jobs all the tests might achieve high coefficients, in 
other branches all low, but the relative validities of the different 
tests were remarkably uniform. Tables XXVII and XXVIII com¬ 
pare the mean validities in three main types of work in the Navy 
and the Army, and while they show that the mechanical and spatial 
tests were, indeed, relatively less useful than verbal-numerical 
ones in clerical and communications jobs, yet they were just about
as useful among seamen, infantry, officers and gunnery ratings as 
among mechanics. 

In the Navy the Mathematics test obtained the highest validity 
in most branches and the Squares test the lowest. Often, indeed, 
the Mathematics test gave better coefficients than T2, the sum
of four tests. The Bennett Mechanical test appeared to give useful

* It is likely that still higher validities would have been obtained had the
Forces made any use of objective tests of attainment, as did the U.S. Army
and Navy. Ordinary examinations and gradings (Criteria of Types C and D,
cf. Chap. VII) are always too unreliable to bring out the full value of the tests.
Moreover, the similar make-up of new-type examinations and paper-and-pencil
tests undoubtedly boosts the correlations.
Table XXVII. Mean Validity Coefficients of Selection Tests in
Clerical, Mechanical, and Other Naval Branches

Naval Branches                  No. of           Matrix  Shipley  Bennett  Arith.  [?]    T2
                                investigations
Writers, supply, signalmen,
  telegraphists                 6                        .38      .10      .44     .18    .42
Elec. mechanics, air fitters
  and mechanics, engine room
  and motor mechanics,
  artificers, stokers           18                       .27      .33      .38     .26    .41
Seamen, officers, gunnery
  ratings, leading seamen       11                       .36      .33      .37     .24    .42


predictions of general practical rather than of specifically mechan¬ 
ical proficiency, and the same was true of the Assembly test in the 
Army. Although the Bennett test is not usually appropriate for 
women, yet, together with the “Mec Assembly” test, it showed 
greater differential value in the A.T.S. than in the men’s Services, 
presumably owing to the lesser diversity among women of previous 
trade experience. In the Army the Clerical and Verbal tests vied 
with arithmetic for top place in most branches. Clerical was out¬ 
standing, for example, among drivers and in officer selection. In 
the A.T.S. the same test did best, followed by Arithmetic and 
spelling. It is noteworthy that American experience seems to have 
been similar. A large-scale validation of the Army Alpha test in
1918 gave an average correlation of 'Bi with officers’ rankings of 
recruits for “value to the Service” (Yoakum and Yerkes, 1920). In 
a follow-up of some 760 to 1,000 men in six typical naval jobs on 


Table XXVIII. Mean Validity Coefficients of Selection Tests in
Clerical, Mechanical and Other Army Jobs

Army Jobs                   No.                          Squ.  Verbal  Clerical  Ass.  Agility

Clerks, storemen,
signallers                   767    .36    .28    .47    .23    .42              .12    .22

Drivers, linemen,
instrument and radio
mechanics                  1,666    .28    .34    .40    .32    .22     .36             .22

Gunners, layers, and
riflemen                   1,241    .13    .34    .26    .34            .23      .10
board large American ships, the highest average validity was
obtained by an arithmetic test, followed by verbal intelligence and
mechanical knowledge and aptitude group tests. When the U.S.
Army revised its general classification test for all recruits, it chose 
a battery consisting mainly of arithmetic and reading tests, pre¬ 
sumably because such tests had proved most valid for Service 
purposes. 

The conclusion followed, in this country, that almost all 
branches of the Forces wanted the same type of man—one with 
good education (especially in mathematics) as well as good intelligence.
Thus the main use of tests was for apportioning the available
supplies of high-quality men among the different branches according
to their needs. Differentiation between jobs was based more on
interests and interview judgments. Several possible explanations 
of this result may be proposed: 

(1) It is shown below that tests like Clerical and Arithmetic are
more reliable than mechanical-spatial or other vocational
tests. This alone would help to raise their validities.

(2) Although such tests do depend on education or the v : ed
factor, they actually have as high g-saturations as non-verbal
tests such as Matrices and will, therefore, correlate well with
any job proficiency involving g.

(3) It is possible that success at these tests depends on certain 
personality or temperamental qualities such as stability and
persistence, in addition to g and education, which are 
relevant to vocational success. 

(4) The extreme heterogeneity of recruits in g and education is 
certainly an important factor. Applicants for any one type of 
civilian employment would be unlikely to range from uni¬ 
versity graduates to mental defectives. Most people do not 
apply for jobs which are quite outwith their capacity. In
a more homogeneous group there would be greater scope for 
specialised tests. 

(5) Jobs in the Forces tend to be more variegated than most
civilian employments. Both at the semi-skilled level (e.g.
seamen, infantry) and at the highly-skilled (e.g. radio
mechanics), the recruit is not engaged in one specific type
of work. Even when he is mainly concerned with particular
equipment and machines he needs a good deal of adaptability
to be able to service old, or freshly introduced, models.
Assessments of his proficiency are likely to refer to numerous
different duties, hence general tests may be more appropriate
than specialised ones.

(6) American Service psychologists have presented evidence 
from several naval jobs to show that mechanical and other 
specialised tests are more valid than verbal educational tests
when the criteria consist of objective measures of proficiency,
but less valid when they consist of instructors' gradings. 
The implication is that such gradings are based more on 
written work and on general impressions of brightness than 
on practical skill. While it is true that the criteria employed 
in this country consisted largely of subjective gradings 
(Types C and D rather than Type A), the superiority of 
verbal tests certainly did not occur only when these gradings 
involved written work. 

(7) Admittedly psychologists in the Forces were able to devote 
little time to the production of specialised tests and were 
much handicapped by shortages of material, etc. It is quite 
possible, therefore, that tests with greater differential value 
might have been found. Nevertheless, many such tests were
tried out (some are described below) but were usually
found to add so little to the multiple correlations obtained 
with the standard tests alone that they were not pursued. 

(8) Most of the validatory criteria consisted of training marks 
or grades rather than assessments of operational efficiency. 
Thus, tests with educational content might be expected to 
be the most successful in selecting men who could be trained 
rapidly. Numerous attempts were made to carry out opera¬ 
tional follow-up, none entirely adequate. However, several 
experiments are worth citing, and so far as they go they do 
not support this explanation. 
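
Explanation (4) can be made concrete with the standard formula for the effect of direct selection on the test variable (restriction of range). What follows is a minimal sketch, not a reconstruction of any Service calculation: the function name, the assumed validity of .45 in the full recruit population, and the ratios of standard deviations are hypothetical figures chosen for the illustration.

```python
import math

def validity_in_restricted_group(r_full, sd_ratio):
    """Correlation expected in a group whose test-score spread has been
    narrowed by selection on the test itself.

    r_full   -- validity in the full, heterogeneous population
    sd_ratio -- S.D. of test scores in the narrower group divided by the
                S.D. in the full population (1.0 means no restriction)
    """
    num = r_full * sd_ratio
    den = math.sqrt(1.0 - r_full ** 2 + (r_full * sd_ratio) ** 2)
    return num / den

# Hypothetical figures: a test with validity .45 among all recruits,
# applied to groups with progressively narrower ranges of ability.
for ratio in (1.0, 0.75, 0.5):
    r = validity_in_restricted_group(0.45, ratio)
    print(f"S.D. ratio {ratio:.2f}: expected validity {r:.2f}")
```

On these assumptions the same test shows a distinctly lower coefficient as the group becomes more homogeneous, which is the sense in which a general test loses some of its advantage among applicants for a single civilian job.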

Operational Follow-up Results 

Assessments of efficiency during fighting in Italy were collected 
for 200 Royal Marine signallers. Naval selection tests were applied 
after the return of the unit to Britain, and the remarkably high
correlation of .624 was obtained with T2. The numbers of good
and less good men in each T2 S.G. are shown in Table XXIX.
For the four component tests the coefficients were all between .46
and .49. Thus Mathematics appears to retain good predictive value,
Table XXIX. T2 S.G.s of Signallers Obtaining High or Low
Assessments for Operational Efficiency

Assessment                       T2 Grades                 Total
                           A      B      C    D or E

Competent or better       43     66     46      4           159
Passable                   1      7     12      2            22
Below standard             1      1     10      7            19

Total                     45     74     68     13           200


but the spatial Squares test does improve when compared with an 
operational criterion. In Coastal Forces, 186 motor launch and 
motor gunboat ratings were assessed on several qualities by their 
officers (by the conference method, cf. p. 110). Correlations with 
“fighting qualities” were around zero, but with “dependability” 
the coefficients for Abstraction, Mathematics and Squares were .26,
.17 and .30 respectively. Thus in this instance, the spatial test did
show up better than v : ed. tests. 

Some 7,000 men in all Arms of the British Liberation Army 
were assessed on the questionnaire reproduced on p. 109 at the 
conclusion of hostilities with Germany, and the test scores which 
certain groups had obtained on recruitment were traced. Among 
260 infantry the Arithmetic test gave a correlation of .263, and all
the other tests smaller coefficients. While this figure is low, the 
reliability of the assessments may be largely responsible; for the 
correlation of the operational assessments with efficiency gradings 
given shortly before the invasion was only .297.
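
The argument about the unreliability of the assessments can be illustrated with Spearman's correction for attenuation. The sketch below is a rough illustration only. It treats the correlation between the two sets of assessments as if it were the reliability of the criterion, and it makes no allowance for the unreliability of the test itself; both are assumptions made for the example, and the function name is hypothetical.

```python
import math

def correct_for_criterion_unreliability(r_observed, criterion_reliability):
    """Estimate the validity against a perfectly reliable criterion:
    the observed correlation divided by the square root of the
    reliability of the criterion."""
    return r_observed / math.sqrt(criterion_reliability)

observed = 0.263      # Arithmetic test v. operational assessment (infantry)
reliability = 0.297   # correlation between the two sets of assessments,
                      # used here as a stand-in for criterion reliability
estimate = correct_for_criterion_unreliability(observed, reliability)
print(f"validity corrected for criterion unreliability: {estimate:.2f}")
```

On these assumptions the low observed figure would correspond to a validity of the same order as those found against training criteria.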

In the A.T.S. over 600 trainees for anti-aircraft duties were 
followed up, and later some 1,300 were assessed for efficiency after 
serving two or more years. The average validity coefficients 
(corrected for selectivity) are shown in Table XXX. Most of the 
tests, and the multiple coefficient, drop considerably in validity at 
the operational stage. But Spelling achieves a much higher 


Table XXX. Mean Validities of A.T.S. Selection Tests in Several
Anti-Aircraft Jobs at Different Stages

Test                               Arith.  Squares  Clerical  Spelling  Multiple r

Training stage      .47    .24      .63               .42       .00        .66
Operational         .36    .21      .30      .26      .37       .31        .43

coefficient, and the Clerical test is relatively more valid than at the
training stage. 

Another approach was to contrast the test scores of men promoted
to higher naval rates with those of men in lower rates, on the assumption
that the Navy would select the most efficient rather than the most 
educable for advancement. The T2 differences among gunnery 
and torpedo rates are portrayed in Fig. 4, and these, together with 
the differences on other tests are expressed as correlation ratios 
(which are similar to correlation coefficients) in Table XXXI. It 
happens that the absolute correlations are much higher in the 
torpedo than the gunnery branch, but the relative order of valid¬ 
ities is the same in both, and the mathematics test greatly surpasses 
the others, including T2. 
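
For readers unfamiliar with the correlation ratio, the following sketch shows how it is computed from grouped scores: it is the square root of the proportion of the total variance in test scores that lies between the groups (here, naval rates). The scores and the rate labels below are invented purely for the illustration, and the function name is hypothetical.

```python
import math

def correlation_ratio(groups):
    """Correlation ratio (eta) of a score on a grouping.

    groups -- dict mapping a group label to a list of scores
    """
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_between = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return math.sqrt(ss_between / ss_total)

# Invented test scores for three rates, lowest to highest.
scores_by_rate = {
    "lowest rate": [38, 42, 45, 47, 50],
    "middle rate": [44, 49, 52, 55, 58],
    "highest rate": [51, 56, 60, 63, 66],
}
print(f"eta = {correlation_ratio(scores_by_rate):.2f}")
```

Unlike the ordinary correlation coefficient, the ratio does not assume that the group means rise in a straight line, which makes it convenient when the criterion is a small number of ordered categories.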


Table XXXI. Correlation Ratios for Naval Selection Tests

Naval Rates                     No.    Shipley  Bennett  Maths.  Squares   T2

Four A.A. gunners rates        1,336     .14      .27     .38      .16     .31
Three torpedo rates              264     .31      .46     .68      .39     .63



Conclusions 

Most of the above evidence appears to show that the uniformly 
high validity of verbal-educational tests is not primarily due to the 
use of training results as validatory criteria, and the explanation 
must be sought in the other seven reasons. The volume of evidence 
as to the value of general group tests, and particularly of tests such 
as Arithmetic and Clerical, is so large and is so unanimous from all 
the Fighting Services that we can safely say that similar tests 
would be of great value for vocational procedures in peace-time 
education and industry, with the following provisos: 

(1) The tests must be carefully constructed to suit the popu¬ 
lations and the conditions under which they are applied. In 
succeeding Chapters it is shown that several of the currently
available tests were inappropriate, especially among the
duller strata of the population.

(2) Such tests are most suitable among school-leavers and other 
heterogeneous groups. They should not be expected to show 
the same validity in the selection of applicants for any 
particular job. 

(3) The motivation of the testees must be adequate, and they
should, as far as possible, have had equal amounts of
previous practice with tests.

(4) Such group tests chiefly indicate the all-round proficiency,
adaptability, and acceptability to employers, and not any
particular skills. In other words they show the general level
of occupation for which a candidate is fitted, and do not
readily differentiate between different occupations at the
same level. At the same time it is easier to determine this
level and to add supplementary tests where needed, than to
follow the principle of preparing an elaborate battery of
specialised tests for each specific occupation. When full
weight is given to interests, previous experience, etc., as well
as to this general level, it may often be found that there is not
very much left over which requires testing.



CHAPTER XIII 


VERBAL INTELLIGENCE AND EDUCATIONAL TESTS 

Abstract. Ordinary verbal intelligence tests are shown to be less
appropriate for adults (particularly those of below average ability) 
than tests of the abstraction type. An account is given of the merits 
and defects of clerical tests, oral directions, vocabulary, reading 
comprehension, verbal fluency, dictation and spelling, arithmetic 
and mathematics, and dial and instrument reading tests. Examples 
are given of the tests found most useful in the Services. Selected 
follow-up investigations illustrate the value of such tests among
Army lorry drivers, naval asdic operators and writers, and R.N.V.R. 
officer cadets. 


Most of the intelligence tests in common civilian use consist of
batteries of sub-tests, each sub-test containing numerous choice- 
response items such as vocabulary, analogies, classification, com¬ 
pletion, reasoning problems, etc., and having a strict time limit of, 
say, two to ten minutes. Sometimes, in the so-called omnibus test, 
items of all types are mixed up, and are explained and practised 
before the test proper begins. 

Several such tests were quite widely applied in the Services, 
including the N.I.I.P. Group Test 33, and the first parts of the 
F.H.3 or F.H.R. tests and of Heim’s (1947) A.H.4. An omnibus 
test known as V.I.T. (Verbal Intelligence Test), containing 120 
items to be answered in twenty minutes, was devised for the Army. 
While these were quite useful in testing high-grade personnel such
as officer candidates (indeed, V.I.T. was probably the best of the
standard W.O.S.B. tests), they were not found satisfactory among
average and low-grade recruits. They tend to be unduly wordy and 
dependent on the testees’ literacy, and a considerable proportion 
of testing time had to be spent on giving instructions and trying 
out sample questions. This added to the responsibility of the 
tester, particularly when separate timing of short sub-tests was 
involved, and as it was necessary to employ large numbers of not 
very thoroughly trained or experienced testers, the results appeared
sometimes to be seriously affected. Another drawback was that the
conventional test items and their choice-responses often struck
senior officers as artificial and quite unrelated to the soldier’s or
sailor’s jobs. It was very necessary in the early days of personnel
selection to win their good will, and it was not easy to answer the
criticism which they sometimes raised, namely, that tests of this 
type are chiefly constructed by university psychologists for testing 
school-children, and are inappropriate for men and women who 
have long grown unaccustomed to paper-and-pencil work and to 
manipulating verbal symbols. Our results on changes in ability
between 14 and 18 years (Chapter XI) lend some support to this view.
Factorial analyses showed that V.I.T. and similar tests are good
measures of g, but that they are also dependent on verbal-educational
ability. Thus it is misleading to regard them purely as
measures of intelligence. Finally, the fact that they mostly have to
be done at maximum speed is objectionable, for, as Cattell (1943)
shows, the performance of adults is more seriously affected by age
in speeded tasks than in tests with a generous time limit. A novel 
form of verbal test, known as abstraction, appeared to provide the 
answer to these difficulties. 

Abstraction Tests 

Early in the war a pair of short tests of vocabulary and abstrac¬ 
tion was published in America by Shipley (1940). They were 
intended for mental or brain-injury patients whose reasoning 
capacities often deteriorate more seriously than does their “crystal¬ 
lised” intelligence, shown by knowledge of words. The vocabulary 
test (S.P.6) was occasionally used in the Forces, but was replaced 
by the more suitable tests described below. Adaptations of the 
second test, however, were very widely applied in the Navy (Test 1 
and R.C. Test D) and by War Office selection boards (S.P. 45). An 
abstraction test contains items like the following*: 

Wherever you see an asterisk, one letter or number is missing. 

Write in the missing letters or numbers. 

L M N O P Q * 

5 16 26 36 46 ** 

big little rough smooth hard **** 
oz. lb. stone cwt. *** 

* No items quoted here are taken directly from tests which are in use, but
are similar to them. The instructions are usually abbreviated.

346 467 668 ***

luck lick lack foul foil **** 

quay key owe oh son *** 

AzM bXn DvO gTp *** 

A remarkable variety of items can be expressed in this form, 
including the familiar analogies, number series, cipher problems,
and so on. (Item analyses of several tests showed the alphabetic
series and cipher types to be the best of all.) Their content can be
given a nautical or military flavour if this is desired. Very easy or
very difficult items can readily be constructed, and the instructions
needed, even with dull subjects, can safely be cut to a minimum.
The reliability coefficients of tests with only twenty items and a
ten-minute time limit are quite high in representative adult
populations. The scoring is completely objective, i.e. there is no danger
of the appearance of alternative or partially right answers, yet at 
the same time there is none of the artificiality of the choice- 
response type of test. The test is, indeed, a perfect illustration of 
Spearman’s eduction of relations and correlates. Its dependence 
on g is very high, on verbal ability or education very low. In the 
writer’s opinion this, and the following instructions test, provide 
the most important developments in group-testing technique 
made in Britain during the war. 

Clerical and Instructions Tests 

A clerical test was constructed in 1942 for Army and A.T.S.
clerks, which resembled the N.I.I.P. Group Test 26, but which
required the testees to perform all the operations (checking, filing,
classifying and coding printed information) in rapid rotation
instead of in separate sub-tests. This was proved to be a good test
for its original purpose, as also was the Institute’s test, but to 
everyone’s surprise the new test was found highly effective in 
selection for numerous other Army categories, as shown in the 
previous Chapter. There seems to be no necessity to confine this 
type of test to clerical material, and the following is an example of
a possible alternative form: 

On each line you have five things to do: 

1. Look at the two words at the beginning of the line. If they
start with different letters, write a X in column (1). If they
start with the same letter, leave a blank.

2. Count up the total number of vowels (A, E, I, O or U) in
both words and enter it in column (2).

3, 4. Classify the longer of the two words under the following
headings:

C. Names of countries.

L. Living things, animals or plants.

N. Non-living objects.

If the first word is longer, write C or L or N in column (3).
If the second word is longer, write C, L or N in column (4).
Leave the other column blank.

5. In column (5) write a X if the two words belong to the same
class. Leave a blank if they are different.

Go through the three examples below, and then work 
across each line*, answering as many of the thirty lines as 
you can in ten minutes. 


                               (1)   (2)   (3)   (4)   (5)

   France        England        X     4           C     X
   Motorcar      Wales          X     5     N
   Elephant      Emerald              6     L

1. Chair         Bed
2. Rose          Russia
   Etc.




It is difficult to conceive why such a test should be successful, 
for it appears at first sight to suffer from most of the defects of 
verbal tests listed above. Yet factorial analysis proves that it is as 
good a test of general intelligence as abstraction and better than 
Matrices. It is not greatly affected by education, but does involve 
a purely clerical factor to a small extent. It seems to depend on: 

(a) Comprehending the rather elaborate instructions. For this
reason the test, as used in the Army and A.T.S., was re-
christened the Instructions test.

(b) Learning the instructions rapidly, in order to avoid going
back to the beginning and consulting them frequently.

(c) Good vocabulary and ability to abstract the meanings of the
words under the given headings.

(d) Mental flexibility and speed, or ability to change over
quickly from one operation to the next.

* It is desirable to make the operations inter-connect as closely as possible, 
otherwise testees will be tempted to work down each column in turn. 

(e) Thinking out efficient methods of work, e.g. doing two or
three operations simultaneously.

(f) A temperamental factor of drive, for maintaining the
necessary concentration throughout.

Although (d) and (f) would not normally be regarded as components
of general intelligence, they may enter into many occupations,
and so enhance the vocational value of the test*.
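
The five operations required on each line of such a test are mechanical enough to be written out as a short procedure, which may be of use in preparing a scoring key. The sketch below is an illustration only: the table of word classes and the function name are hypothetical, and it assumes that the two words on a line are never of equal length.

```python
VOWELS = set("AEIOU")

# Hypothetical lookup of word classes: C = country, L = living thing,
# N = non-living object.
WORD_CLASS = {
    "FRANCE": "C", "ENGLAND": "C",
    "ELEPHANT": "L", "EMERALD": "N",
    "ROSE": "L", "RUSSIA": "C",
    "CHAIR": "N", "BED": "N",
}

def answer_key(first, second):
    """Return the correct entries for columns (1) to (5) as a list of
    strings; an empty string stands for a column left blank."""
    a, b = first.upper(), second.upper()
    col1 = "X" if a[0] != b[0] else ""
    col2 = str(sum(ch in VOWELS for ch in a + b))
    # The class of the longer word goes in column (3) if it is the
    # first word, otherwise in column (4).
    col3 = WORD_CLASS[a] if len(a) > len(b) else ""
    col4 = WORD_CLASS[b] if len(b) > len(a) else ""
    col5 = "X" if WORD_CLASS[a] == WORD_CLASS[b] else ""
    return [col1, col2, col3, col4, col5]

for pair in [("France", "England"), ("Elephant", "Emerald"), ("Rose", "Russia")]:
    print(pair, answer_key(*pair))
```

The key for France and England, and for Elephant and Emerald, agrees with the worked examples given to the testees; the remaining entries follow from the same rules.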

Directions Tests 

The ordinary written directions item does not appear appro¬ 
priate for adults of sub-average intelligence, for example: 

If the fourth word in this sentence is longer than the fifth 
word, write down the second letter of the last word; if not, write 
down the last letter of the third word. 

But an oral directions test where complex instructions are read 
out by the tester was used successfully in the Army Alpha battery 
and, in this country, gave useful predictions in several jobs such as 
asdic operator and motor driver. Men who are unaccustomed to 
reading and writing are much less handicapped than with a printed 
test. Moreover, efficiency in the Forces depends so largely on 
understanding of, and prompt reaction to, oral directions that this 
test might be regarded as a work-sample vocational test. Unfortunately
it suffers from one fatal defect, namely, its dependence on
the tester’s enunciation and manner of application. It can hardly
be used unless always given by the same tester, and under really
quiet conditions. It was not possible to standardise the test by
putting the directions on to a gramophone record, since adequate
sound reproduction could not be ensured in all testing centres.
Moreover, in one experiment with dictation tests, gramophone
records yielded distinctly poorer scores than oral testing (even
though the recording was made by a B.B.C. announcer).

Vocabulary and Reading Tests 
A test which explicitly involves ability with words was needed
for several jobs, including clerks and officers. In the Royal Navy, a
reading comprehension test (S.P.96), borrowed from the U.S. 

* One defect is the abnormality of its score distributions. Nearly 10 per cent.
of a representative population can hardly get started on it, and almost as many
at the top end get nearly perfect scores, unless the time limit is reduced. Such
a U-shaped distribution is advantageous for most selection purposes, but is
highly inconvenient to the statistician.

Navy, added considerably to the standard tests. This test contains
six passages and thirty choice-response questions requiring the 
abstraction of information from the passages. A much simpler test 
(S.P.22), in two parallel forms, was devised for semi-illiterate 
Army recruits and standardised on children to yield reading ages. 
This was used in an experiment on the effects of basic education
courses, where it was shown that the improvement among illiterates
sent on a six-weeks' course by their own units was small and 
irregular; but that among men selected by P.S.O.s and psychia¬ 
trists as likely to benefit from the courses, the improvement was 
much larger. Two tests of attainment in English, chiefly involving 
reading comprehension and vocabulary, were included in the 
R.A.F. air crew batteries and one of these (Gen-B) was found to 
give promising predictions in all categories except pilot. 

For individual testing, e.g. of naval neuropsychiatric patients, the 
vocabulary, comprehension and similarities tests of the Wechsler- 
Bellevue scale (S.P.6) were used (cf. Trist, 1941). With Army and 
A.T.S. officer candidates, the Mill Hill vocabulary test (Raven 
and Walshaw, 1944) was satisfactory. But when older, serving 
officers came up for re-allocation both this and other verbal tests 
often seemed to arouse a good deal of anxiety, and a more suitable 
test was devised consisting of fifteen words (such as form and bit) 
for each of which four different meanings are to be given in writing
in fifteen minutes. This test, the abbreviated Wechsler, and 
Shipley’s vocabulary, all have much the same factor content, 
showing fairly high dependence on g, but also having large verbal- 
educational loadings. 

More reliable than any of these was the Army’s Verbal test 
(S.P.25), based on synonyms, homonyms, and rhymes. Specimen 
items are shown below: 

Write down on the dotted line a word which means nearly
the same as the word on the left, and which starts with the two
or three letters that follow:

Example  COMMENCE  ....begin....

1. THEFT  LAR..........

Write a third word on the dotted line which rhymes with the
left-hand word and means nearly the same as the right-hand
word:

Example  RANGED  ....change....  ALTER

1.   ..........  LARGE

Write a third word on the dotted line which means nearly the
same as the left-hand word, and, in a different sense, means
nearly the same as the right-hand word:

Example  RELISH  ....sauce....  IMPUDENCE

1. FIRM  ..........  RAPID

Like the abstraction test, this has the advantage of objective 
scoring without choice-responses. (Indeed, Shipley’s original test 
contains an item of the third type just quoted.) 

The main R.A.F. battery, devised by Stephenson (cf. Chapter V)
has a V or verbal section. This, too, contains synonyms and
opposites, reading comprehension, and questions on verbs, nouns,
and adjectives, e.g.:

Make adjectives (describing words) from the following nouns 
(naming words). The first has been done for you. 

Example  LEAF  ....leafy....

1. HEAT  ..........

2. AFFECTION  ..........

Dictation and Spelling Tests 

Ordinary passages of prose are unsuitable for dictation, since so 
much time has to be spent on easy, non-discriminating words.
Several parallel passages (S.P. 70-74) were devised for the Navy,
each containing thirty-five words, almost all of which were fairly
to very difficult. But these were barely long enough to be reliable,
and, although satisfactory when given by carefully trained testers,
were undoubtedly affected to some extent by their enunciations
and accents. In one experiment an experienced A.T.S. officer gave
the same passage twice, first with good delivery, secondly, with bad
delivery which did not involve mispronunciation nor mistiming,
but merely poor articulation and failure to say each syllable clearly.
The average errors rose from 1.66 to 7.65 under the second condition.
Examples of the mistakes that occurred with bad, but not
with good, delivery are:

Successive for excessive
In for and
Is for has
The for though
Gratified for gratifying
Patrolling for patrol.
A better form of test (S.P. 127-130) consists of twenty words, of 
a suitable range of difficulty, each incorporated in a sentence
which helps to define the word and to determine its tense, part of
speech, etc. The whole sentence is dictated, but only the critical
word is written down and scored. For example:

ONIONS. I grow ONIONS on my allotment.

SEIZE. SEIZE hold of the rope and pull it.

A speed of twenty seconds per word, including its sentence, was
found adequate. Even in this test the influence of the tester was
apparent. For example, five testers, dictating plumbline to comparable
groups of about 130 recruits, obtained 40, 60, 41, 34 and 61
per cent. of right answers. Presumably the second and fifth testers
pronounced the word much more intelligibly than the others,
perhaps sounding the B.
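
Whether differences of this size between testers could reasonably be put down to chance can be checked with a chi-square test of homogeneity. The sketch below is a rough check only: it assumes exactly 130 recruits per tester (the text says "about 130") and works from the rounded percentages, so the result is approximate; the function name is hypothetical.

```python
def chi_square_homogeneity(percent_right, n_per_group):
    """Chi-square statistic for equal proportions across groups of
    equal size; degrees of freedom = number of groups - 1."""
    props = [p / 100.0 for p in percent_right]
    pooled = sum(props) / len(props)
    return sum(
        n_per_group * (p - pooled) ** 2 / (pooled * (1.0 - pooled))
        for p in props
    )

percent_right = [40, 60, 41, 34, 61]   # five testers, the word "plumbline"
chi2 = chi_square_homogeneity(percent_right, n_per_group=130)
CRITICAL_0_001_DF4 = 18.47             # tabled chi-square, 4 d.f., P = .001
print(f"chi-square = {chi2:.1f} on 4 d.f.")
if chi2 > CRITICAL_0_001_DF4:
    print("differences between testers lie beyond the .001 level")
else:
    print("differences between testers do not reach the .001 level")
```

On these assumptions the statistic is well beyond the tabled value, so the variation between testers is most unlikely to be a matter of sampling alone.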

Objective spelling tests are the only way out of this difficulty. 
In the A.T.S. test (S.P.14), five incorrect versions and one correct
version of each word were mixed up. A synonym at the beginning 
of each line helped the testees to identify the word. The right 
spellings had to be underlined. For example: 

RAPID qick kwick qwick quick quic cwik 

GRASP seize sieze sease seez size siese 

Admiralty psychologists adopted an intermediate form consisting 
of sentences in each of which one word was mis-spelt, the testee hav¬ 
ing to find this and write it out correctly (R.C. Test C). For example: 

He is a very qwick runner.
Sease hold of the rope and pull it.

In one investigation, six varieties of dictation or spelling tests 
were given to the same recruits, and it was concluded that they all 
measure the same ability almost equally well. (This conclusion 
would probably not be true among school-children.) All have
moderate g, but larger verbal-educational saturations. Straight 
dictation, however, probably involves least g, and the A.T.S. type 
of test may depend to a slight extent on a distinct factor of ability 
at clerical work. The straight dictation was the least reliable, but 
the reliabilities of all other types of test were almost equally high. 
The recruits’ preferences were recorded, and again there were no 
outstanding differences. In another factorial study the sentence 
dictation and A.T.S. Spelling tests were shown to measure the same 
abilities as the Army Verbal test. In a representative male adult
population ability to spell does not seem to be differentiated from 
general education in words. 
Arithmetic and Mathematics Tests

Both the Admiralty and War Office psychologists constructed
tests in two parts, the first consisting of a few minutes of straightforward
addition, subtraction, multiplication and division sums,
the second including thirty to forty brief questions for which
eight to ten minutes was sufficient time. For example:

How many pence in half-a-crown ? . 

Ii + 2J= . 

Increase 60 by 10 per cent. . 

sec A / cosec A = .....

Scores on the first part tend to be negatively, on the second part 
positively, skewed. In other words the lower half of the population 
is best differentiated by its skill in rote arithmetic, the upper half 
by its knowledge of fractions, decimals, algebra, etc. The average 
male adult can scarcely manage simple fractions and decimals, and 
is completely stumped by percentages, square roots, and the like. 
The two parts combined provide a highly reliable test which not 
only spreads out the population better, but is also generally more 
predictive, than the typical schoolmaster’s examination, which is 
chiefly based on elaborate arithmetical problems, and which is four 
to ten times as long. We have already commented on the surprising 
vocational value of these tests and the fact that, even in relation to 
practical criteria, they often gave better predictions of proficiency 
in mechanical trades, and among seamen, infantry, and officer 
cadets, than did either mechanical or general intelligence tests. 
The second part was most successful among high-grade groups 
(electrical mechanics, officers, etc.), and the two parts together 
among average recruits. The high g-saturation, particularly of the
second part, suggests that more intelligent adults both learn and
retain more mathematics than dull ones, and that the test thus
measures educability rather than merely education received.
Though both parts do involve a distinct numerical-education 
capacity, they may also be influenced by temperamental factors. 
For it has been noted that emotionally unstable and delinquent 
children at a child guidance clinic tend to be more retarded at 
arithmetic than at other school subjects. Thus when we find that 
recruits, picked out by naval instructors as never likely to make 
good seamen, are poorer on this test than on any other, this does 
not mean that the instructors cannot tolerate the badly educated,
but rather that most of these men are either innately dull, or
maladjusted individuals who fail to settle down to naval life, just
as they failed to settle to school life. Similarly, Lummis (1946) has 
shown that bad Army conduct records are associated more closely 
with low scores on arithmetic than on other S.P. tests, and that 
arithmetic is most seriously affected by irregular schooling and 
truancy. It was interesting to observe moderately high correlations 
even with assessments of officer quality and power of command 
among R.N.V.R. cadets. Thus when the top quarter of a group of 
660 assessed most highly was contrasted with the lowest quarter, 
it was found that 66 per cent. of the former and 30 per cent. of the
latter were acquainted with Pythagoras’ theorem, 78 per cent. and
68 per cent. knew the cube root of 64, and so on.

An alternative type of test (S.P.23) was preferred in the A.T.S. 
because the mathematics test was too difficult. (The average female 
adult cannot even manage decimals.) Worked sums were provided 
in which there were three mistakes which the testee had to correct. 
For example: 

              £     s.    d. 
Subtract     456    12     8 
             269     6    10 
            ---------------- 
            £196     6     8 

Such a test, with ten sums to be checked in ten minutes, is also 
very reliable, and measures nearly the same abilities as the others, 
but depends to a small extent on ability at clerical work. From the 
vocational viewpoint, arithmetic was often less predictive than 
spelling among women. 
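
The checking task in the worked sum above can be made concrete with a short sketch. It is not part of the original test materials: the function names are illustrative, and it assumes pre-decimal currency (12 pence to the shilling, 20 shillings to the pound). It works the subtraction afresh and lists the columns in which the answer printed in the example differs from the correct one.

```python
# Illustrative sketch only: checking a worked sum in pounds, shillings and pence.
# Assumes pre-decimal currency: 12 pence = 1 shilling, 20 shillings = 1 pound.

PENCE_PER_SHILLING = 12
SHILLINGS_PER_POUND = 20
PENCE_PER_POUND = PENCE_PER_SHILLING * SHILLINGS_PER_POUND


def to_pence(pounds, shillings, pence):
    """Convert an amount in pounds, shillings and pence to pence."""
    return pounds * PENCE_PER_POUND + shillings * PENCE_PER_SHILLING + pence


def from_pence(total):
    """Convert a non-negative number of pence to (pounds, shillings, pence)."""
    pounds, rest = divmod(total, PENCE_PER_POUND)
    shillings, pence = divmod(rest, PENCE_PER_SHILLING)
    return pounds, shillings, pence


def subtract(a, b):
    """Subtract amount b from amount a, both given as (pounds, shillings, pence)."""
    return from_pence(to_pence(*a) - to_pence(*b))


def wrong_columns(printed, correct):
    """Name the columns in which a printed answer differs from the correct one."""
    names = ("pounds", "shillings", "pence")
    return [n for n, p, c in zip(names, printed, correct) if p != c]


correct = subtract((456, 12, 8), (269, 6, 10))
print(correct)                               # (187, 5, 10)
print(wrong_columns((196, 6, 8), correct))   # ['pounds', 'shillings', 'pence']
```

Converting to pence before subtracting avoids handling the borrowing between columns by hand, which is where such sums usually go wrong.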

For its high-grade air crew recruits, the R.A.F. adopted several 
more difficult American tests, all of selective-response type. Mat-A 
tests algebra and trigonometry, Mat-B contains verbal problems, 
and Mat-D rote arithmetic. Mat-C requires testees to find 
approximate answers to such questions as: 

248.4 miles per hour for 20 hours and 16 minutes = (a) 613 
miles, (b) 60 miles, (c) 5,000 miles, (d) 800 miles, (e) 603 miles. 

Mat-F involves reading off the correct entries from numerical 
tables. The two latter tests bear a close resemblance to some of the 
jobs of the navigator. 

Instrument Reading Tests 

Two other tests borrowed from the U.S. Army Air Force were 
found useful in the allocation of air crew categories. Ins-A shows 
sets of dials, as on an aeroplane panel, marked Altitude, Fuel-Air 
Ratio, etc. There are fifty-seven choice-response items calling for 
the correct reading of appropriate dials. In Part I of Ins-B sets of 
six dials have to be read and related jointly to descriptions of an 
aeroplane’s velocity, orientation, altitude, etc. Part II shows sets 
of two dials, Artificial Horizon and Compass. These have to be 
read and related to illustrations of aeroplanes in flight. In each 
item there are five planes, only one of which can be reconciled with 
the pair of readings. 

A simpler test (S.P.119) was devised for naval radar operators. 
The first part shows several pictures of bits of rulers, scales and 
dials, the testee having to read off the numbers indicated by 
pointers (with or without interpolation). The second part involves 
reading the co-ordinates of points on a rectangular, and on a very 
simple polar, graph. 


Illustrative Investigations 

Many of the above statements and conclusions about tests may 
appear somewhat dogmatic. It is impossible to give all the evidence 
upon which they are based, but the following brief outline of some 
of the main validatory experiments will, it is hoped, cover some 
of the ground. 

R.A.S.C. Lorry Drivers.—A large battery of tests gave the 
following correlations with driving proficiency in a group of 240 
men followed up at the end of training. 


Table XXXII. Validity Coefficients of Tests for Drivers 

Age                                          .260 
Matrices                                     .312 
Bennett Test 2                               .635 
Personality inventory                        .439 
Previous driving experience                  .511 
Cycling experience                           .382 
Oral directions test                         .486 
Questionnaire on driving knowledge           .431 
Interest blank scored for drivers            .423 
Group choice reaction time                   .336 
Squares, Test 4                              .287 
Interest blank scored for mechanics          .266 
Judgment of distances                        .212 
Judgment of ellipses                         .200 
Judgment of speed                            .160 


In another similar group of 306 recruits, Chleusebairgue’s (1939) 
elaborate choice reaction time test gave a correlation of .386. The 
multiple correlation of the first four items on the list combined is 
.669, and the figure does not rise to more than .68 or .70 when any 
other test (except driving experience) is added. Note that the three 
analytic tests, borrowed from the N.I.I.P. battery (Miles and 
Vincent, 1934), come at the bottom, and that oral directions is the 
best test after Bennett. The interest blanks and inventories are 
discussed in Chapter XV. 
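
Why further tests add so little to a multiple correlation can be illustrated with the standard two-predictor formula, R² = (r1² + r2² − 2·r1·r2·r12) / (1 − r12²). The sketch below is not taken from the original analysis. The two validities are the tabled figures for oral directions and the driving-knowledge questionnaire, but the intercorrelations between the two tests (r12) are hypothetical values chosen only for illustration, since the text does not report them.

```python
import math


def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two predictors.

    r1, r2: correlation of each predictor with the criterion.
    r12:    correlation between the two predictors.
    """
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return math.sqrt(r_squared)


ORAL_DIRECTIONS = 0.486  # validity as tabled
QUESTIONNAIRE = 0.431    # validity as tabled

# Hypothetical intercorrelations between the two tests (an assumption).
for r12 in (0.0, 0.3, 0.6):
    combined = multiple_r(ORAL_DIRECTIONS, QUESTIONNAIRE, r12)
    print(f"r12 = {r12:.1f}  ->  multiple R = {combined:.3f}")
```

With these figures the combined validity falls as the overlap between the two tests grows, although it never drops below the better single test.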

Asdic Operators .—Several experiments were carried out in anti¬ 
submarine training schools, on the basis of which an omnibus 
intelligence test (Selection Test A), mainly consisting of oral 
directions items, a gramophone test of sense of pitch (Doppler 
effect), and a group choice reaction time test, were installed as a 
regular battery. It was also found that Heim’s A.H.4 test, contain¬ 
ing verbal and non-verbal intelligence items, was a satisfactory 
alternative to Selection Test A. Later an individual audiometric 
examination was added by the medical officers. One sample of 282 
trainees, selected in respect of the above battery and A.H.4, but not 
directly selected on T2, yielded the correlations with training 
results shown in Table XXXIII. 


Table XXXIII. The Value of Different Intelligence Tests in Selecting 
Asdic Operators 

                        Correlations    Corrected for Selectivity 
Asdic battery               .343                  .555 
A.H.4                       .364                  .572 
T2                          .440                  .697 

When statistical correction for selectivity was applied, all three 
tests or sets of tests showed similar, fairly high, validities. 
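
The text does not say which correction formula was used. One widely used formula for direct selection on the test itself (the correction for restriction of range, often called Thorndike's Case 2) is R = r·u / sqrt(1 − r² + r²·u²), where u is the ratio of the unselected to the selected standard deviation of test scores. The sketch below applies it to an observed coefficient of .343 with hypothetical values of u. It illustrates the principle only and is not a reconstruction of the authors' calculation.

```python
import math


def correct_for_selectivity(r, sd_ratio):
    """Estimate the correlation that would hold in the unselected group.

    r:        correlation observed in the selected group.
    sd_ratio: unselected SD divided by selected SD of the test
              (hypothetical values are used below).
    """
    u = sd_ratio
    return r * u / math.sqrt(1 - r ** 2 + (r * u) ** 2)


observed = 0.343
for u in (1.0, 1.5, 2.0):  # hypothetical SD ratios
    print(f"SD ratio {u:.1f}  ->  corrected r = {correct_for_selectivity(observed, u):.3f}")
```

When the ratio is 1 (no selection) the coefficient is unchanged. The more the selected group has been narrowed, the larger the upward correction.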


Table XXXIV. The Value of Intelligence and Other Tests in Selecting 
Asdic Operators 

Tests                                    Correlations 
Intelligence, Selection Test A 
Group choice reaction time 
Doppler Record 
Audiometric threshold 
Audiometric sense of pitch 

In the course of time the reaction time apparatus, and its method 
of application, became unreliable. Nearly four years after the 
institution of the tests the correlations with training results, shown 
in Table XXXIV, were obtained among 358 recruits (also a 
selected group). It will be seen that the intelligence test retains 
much the same validity as before, and the Doppler record assists 
slightly in selection. The audiometric tests, though also useful, do 
not appear superior to this group auditory test. 

Writers and Supply Assistants.—Some 800 of these naval clerks 
were followed up in 1943, and their scores on the standard battery 
correlated with training results. Four to six hundred more were 
given additional tests in 1944-5, with the results shown in Table 
XXXV. (Tests 2 and 4 are omitted since they gave much lower 
coefficients.) The second set of figures for T2 tests is lower than 
the first because of more efficient selection, and the additional 
tests attain rather higher coefficients than they would have, had 
they also been used for selection. 


Table XXXV. The Value of Tests in Selection for Clerical Work 

                                           Correlations in: 
Test                                        1943     1944-5 
0 Matrix                                     .37       .27 
1 Abstraction                                .36       .29 
3a Arithmetic                                .36       .20 
3b Mathematics                               .43       .32 
T2                                           .42       .36 
71 Dictation                                 .36       .33 
21 Army Instructions                          -        .43 
Group Test 26 (N.I.I.P. Clerical)             -        .44 
96 U.S. Reading comprehension                 -        .38 


Nevertheless, it was found that either of the clerical tests, or the 
reading comprehension, would add appreciably to predictions 
based only on the standard battery. 

R.N.V.R. Officer Cadets.—T2 was regularly applied to officer 
cadets under training in H.M.S. King Alfred in 1942-3, but several 
additional tests were tried out on successive large groups, mostly 
around 500, which were not taken into account by the Admiralty 
selection boards. The additional variance in final passing out marks 
covered by these extra tests was calculated, but, for simplicity's 
sake, the results are expressed in Table XXXVI in the form of 
partial correlations with T2 (in the first column) or Tests 1 + 3b 
(in the second column) held constant. The validity of T2 itself 
ranged around .636, and the validities of the additional tests were 
between .30 and .55. 

Table XXXVI. Partial Correlations Showing the Additional Value 
of Various Tests in Officer Selection 


                                            Partial Correlations 
Test                                     T2 Constant   1 + 3b Constant 
96 Reading comprehension                     .20 
Mathematics examination                      .17 
12 Army Clerical 
21 Army Instructions 
Group Test 26, N.I.I.P. Clerical             .17 
46 W.O.S.B. Abstraction                      .10 
16 V.I.T. 
Group Test 33, N.I.I.P. intelligence         .19 
6 Shipley Vocabulary                         .18 
109 Figure construction                                     .11 
97 Memory for designs                        .06 
80 Orientation                               .26            .24 
Group Test 70/23 non-verbal                   - 
Progressive Matrices, 1942 



The Clerical or Instructions tests certainly have a contribution to 
make, and are probably superior to the conventional form of 
clerical test (Group Test 25). Reading comprehension, too, is 
useful, and possibly the mathematics examination set by the training 
establishment itself. Other verbal intelligence tests are of more 
doubtful value, and non-verbal ones appear quite useless. Two 
spatial tests add very little, but a verbal orientation test (discussed 
in the next Chapter) seems to be one of the best. Unfortunately, 
it was given to only 166 cases, hence its results are less certain. 
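
The partial correlations of Table XXXVI express what an extra test adds once T2 (or Tests 1 + 3b) is held constant. The usual first-order formula is r(xy.z) = (r(xy) − r(xz)·r(yz)) / sqrt((1 − r(xz)²)(1 − r(yz)²)). The sketch below is a minimal illustration: the validity of .45 and the intercorrelation of .60 are hypothetical, and only the figure of .636 for T2 is quoted from the text.

```python
import math


def partial_correlation(r_test_criterion, r_test_control, r_criterion_control):
    """Correlation of test and criterion with the control variable held constant."""
    numerator = r_test_criterion - r_test_control * r_criterion_control
    denominator = math.sqrt(
        (1 - r_test_control ** 2) * (1 - r_criterion_control ** 2)
    )
    return numerator / denominator


extra = partial_correlation(
    r_test_criterion=0.45,      # hypothetical validity of an additional test
    r_test_control=0.60,        # hypothetical correlation of that test with T2
    r_criterion_control=0.636,  # validity of T2 as quoted in the text
)
print(f"partial correlation with T2 constant: {extra:.3f}")
```

A test with a respectable validity of its own can therefore show only a small partial correlation if it overlaps heavily with the battery already in use.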


















CHAPTER XIV 


NON-VERBAL AND MECHANICAL TESTS 

Abstract. The Progressive Matrices and other non-verbal 
intelligence (g) tests were widely used in the Services. Several tests of 
spatial judgment (k-factor) were tried. Though these are on the 
whole less valuable among adults than adolescents, e.g., engineering 
apprentices, there is some evidence that they are predictive of 
“practical” ability. Performance tests were developed for 
application to African and Indian recruits and other special cases. 
Orientation and observation tests are described. Among the various 
mechanical tests, group paper-and-pencil tests of mechanical 
comprehension, mechanical and electrical information and trade 
knowledge, were generally more predictive than practical assembly or 
trade tests. Illustrative investigations show some of the results 
obtained among naval radar operators, boy tradesmen, radio and 
electrical mechanics. 


The Progressive Matrices test (Raven, 1939) was adopted as the 
primary general intelligence test in the Royal Navy, Army and 
A.T.S. in 1941, largely in order to ward off criticisms of the 
educational bias in verbal tests. An item of the Matrices type is shown in 
Fig. 5. This test has been applied to greater numbers of men and 
women in this country than any other single one. For ease of 
administration a twenty-minute time limit was usually imposed. A 
harder version (unpublished) was constructed by Raven for testing 
Army officer candidates in 1942. Numerous factorial analyses 
showed that, while Progressive Matrices is an almost pure g test, 
it does involve the visuo-spatial or k factor to a small extent. For 
vocational purposes it was somewhat disappointing. Three reasons 
may be suggested. First, the success of such tests as Instructions 
and Arithmetic showed that the Services did not actually want pure 
intelligence, so much as intelligence and education. The 
comparatively small extent to which it differentiates between different 
occupational grades has been pointed out in Chapter XI. Secondly, 
whenever a battery of verbal and mechanical-spatial tests was 
applied, their combined result yielded a considerably better 
measure than did Matrices alone. Only when its visuo-spatial 
component was involved did it seem to have anything to add to such a 
battery. Thus its best correlations were with proficiency in visual 
signalling and radar operating and among A.T.S. recruits engaged 
on various mechanical and anti-aircraft duties. The third reason is 
its rather poor reliability, and its susceptibility to non-intellectual 
influences. More than any other group test it is affected by age and 
[...] other types of emotional stress. The improvement in scores 
after attendance at physical development courses was described 
above (p. 201). Dr. J. A. Fraser Roberts was able to show, by a 
detailed comparison with more reliable tests, that the unreliability 
is greatest in the 16-30 score range, i.e. just about the level where 
acceptance or rejection for the Services takes place. Its efficiency 
is greater the higher the score (in spite of the negative skewness of 
its distribution), and also at very low score levels. Doubtless, the 
test is more valuable when given without time limit by an 
experienced clinical psychologist who can interpret the significance 
of failures on particular test items. No evidence could be obtained, 
however, that men with scores which Raven calls “unreliable”, i.e. 
irregular patterns of scores on the five sections of the test, were 
more neurotic or in any other way different from men with 
“reliable” scores (cf. Eysenck, 1944). The test is seldom given now 
in the Services, mainly because its wide use has caused it to become 
rather well known. 

Fig. 5. Specimen item as in the Progressive Matrices test. Which of 
the numbered pieces 1-8, if fitted into the empty space above, would 
complete the pattern? 

Another matrices test, where the testee draws in his answers 
instead of choosing one of a number of patterns, was devised by 
Stephenson for the g section of his R.A.F. battery. Though 
sometimes difficult to score, this version seems to be more reliable. The 
N.I.I.P. Group Test 70, which was occasionally used in the Navy 
and Army, has three parts of which the last two resemble Progres¬ 
sive Matrices. Part I is a very simple but surprisingly effective test. 
Five two-line drawings are shown, with a long series of incomplete 
bits of these drawings. The testee has to identify the complete 
drawing to which each bit belongs. In the absence of definite 
evidence, it is possible that this test involves g for much the same 
reason as the instructions test, namely, rapid learning of the com¬ 
plete drawings, mental flexibility, and continuous concentration. 
Like Matrices, all parts show some dependence on visuo-spatial 
ability. An alternative test to Matrices was constructed for the 
Army from domino patterns. This Dominoes test, though less 
attractive to the testees, is more reliable and more g-saturated than 
Matrices and shows no visuo-spatial element, but may involve 
number-ability to a small extent among dull and backward recruits. 

Spatial Judgment Tests 

The spatial test most widely used in the Navy, Army and A.T.S. 
was the N.I.I.P. Squares test (cf. Fig. 6). This consists of a series 
of fifty figures in each of which the testee has to draw a dividing 
line such that the two pieces so formed would, if turned around, 
make a square. The N.I.I.P. Group Test 80 (devised by Slater, 
1940) was occasionally given to Army engineer officer candidates 
and R.A.F. apprentices. This is based on the recognition of shapes 
when turned around or shown mirror-wise. Though the Institute's 
Form Relations test was seldom employed because of its heavy paper 
consumption, the accompanying Memory for Designs (S.P.97) was 
often given in the Navy. It has the drawback of being very difficult 
to score. A test based on the copying of straight-line figures on to 
a space marked out with dots has been used by Stephenson, 
McQuarrie, Slater (“Figure Construction Test”) and others (cf. 
Fig. 7). An extension of this was to have some of the figures 
reversed mirror-wise or turned through 90 degrees before being 
copied. Since the testees’ responses are creative instead of selective, 
there is a slight subjective element in scoring their correctness. 
Still another type of spatial test is the paper formboard (cf. 
Paterson and Elliot, 1930). The selective-response versions of this 
test, sometimes used in America, appear rather artificial. But a 
creative-response version tried out in the Navy had to be 
discarded because of the subjectivity of scoring, in spite of the great 
advantage that this test can be “got across” to dull recruits more 
readily than any of the others discussed here (cf. Fig. 8). 
Stephenson’s k-test contains paper formboard, figure construction and 
mirror-reflection items. 

Fig. 6. Squares Test. In the left-hand figure, a line drawn from A 
to B would divide the figure into two pieces which could be put 
together to make a square. In the right-hand figure a single line 
can be drawn between two points, dividing it into two pieces which 
would make a square. 

Fig. 7. Figure Construction Test. Copy each figure on to the dotted 
space to the right of it. Each line must start and end up at a 
dot. Begin at the dot with a circle round it. 

Fig. 8. In the left-hand figure, the two blank pieces could be fitted 
together and turned around to give the square. Draw lines in the 
right-hand shape to show how the four black pieces could be fitted 
together. 

The precise significance of these tests, and the extent to which 
they measure the same abilities or factors as non-verbal tests like 
Matrices, or performance and mechanical tests, are obscure. While 
they were originally designed as intelligence tests (for example in 
Army Beta), many investigations such as Kelley’s (1928), El 
Koussy’s (1936) and Thurstone’s (1938) have shown that they 
embody a space factor, named “k” or “S”, distinct from g, 
whose essence (according to El Koussy) is the use of visual imagery 
for the mental manipulation of spatial relations. Investigations in 
the Services similarly showed that there is little to choose between 
the half-dozen tests just mentioned. They all depend to about 30 
per cent. on g and to the same extent on k, the remaining 40 per 
cent. being specific factors or unreliability (none of them is as 
reliable as a verbal or numerical test of the same length). Price’s (1940) 
study shows that k is identical with the practical factor that enters 
into many performance tests, which Alexander (1935) calls F. 
Though Drew (1947) has recently denied this, his figures actually 
appear to support Price’s. These, and Alexander’s own 
investigations, further suggest that allegedly pure g tests may contain some 
k (as we found with Matrices), and that there is considerable 
overlap with the “m” factor present in mechanical tests. 
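
The variance breakdown quoted above implies definite correlations if g and k are treated as uncorrelated factors, an assumption made here for illustration and not stated in the text. On that assumption a test with 30 per cent. of its variance due to g and 30 per cent. due to k has a loading equal to the square root of .30 on each factor, and the correlation between two tests is the sum of the products of their loadings. The function names in the sketch are illustrative.

```python
import math


def loading(variance_share):
    """Factor loading implied by the share of test variance due to a factor."""
    return math.sqrt(variance_share)


def correlation(loadings_a, loadings_b):
    """Correlation between two tests sharing only uncorrelated common factors."""
    return sum(a * b for a, b in zip(loadings_a, loadings_b))


# Loadings are given in the order (g, k).
spatial_test_1 = (loading(0.30), loading(0.30))
spatial_test_2 = (loading(0.30), loading(0.30))  # a second test of the same make-up
pure_g_test = (1.0, 0.0)                         # idealised test measuring g only

print(round(correlation(spatial_test_1, spatial_test_2), 2))  # 0.6
print(round(correlation(spatial_test_1, pure_g_test), 2))     # 0.55
```

Two such spatial tests would thus correlate about .60 with each other, and either would correlate about .55 with a pure g test, with the remaining 40 per cent. of variance contributing nothing to either figure.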

There is ample evidence (cf. p. 207) of the vocational value of 
k tests among adolescents with little mechanical experience, but 
very rarely were they found to assist in the selection of adult 
mechanics in the Forces. Nor did they show any marked 
correlation with radar operating or other jobs involving prominent visual 
components. Nevertheless, the occasional successes of Squares were 
particularly interesting. Not only did it show up better when 
compared with operational, as distinct from training, criteria among 
Royal Marine signallers and Coastal Forces ratings, but also higher 
rates in the regular (continuous service) Navy did unusually well, 
and advancement correlated with educational ability chiefly among 
conscripts (hostilities only recruits). In the A.T.S., Squares was 
outstanding among draughtswomen and certain anti-aircraft 
personnel. There was fair agreement with practical efficiency tests 
in infantry. Finally, Army recruits picked out by psychiatrists as 
lacking in combatancy or questionable in emotional stability did 
particularly badly in the test. This evidence suggests that there is, 
indeed, a rather vague and ill-defined factor of practical ability, 
distinct from g and from mechanical ability, which is measured 
(unfortunately not at all efficiently) by visuo-spatial tests. Shortage 
of psychologists and restrictions on paper prevented fuller research 
which might have led to the development of better tests of this 
type. 

Performance Tests 

The Cube Construction test was sometimes applied by Admiralty 
psychologists to doubtful candidates for mechanic branches, and 
several performance tests were tried out in W.O.S.B.s (cf. p. 66). 
But all these were regarded more as qualitative than as quantitative 
tests, since they threw light on the candidates' methods of tackling 
problems, and no systematic results were collected on large 
numbers. The performance test most widely used in the Navy and 
Army was an adaptation by Trist and Misselbrook of Kohs Block 
Design, in several ways superior to the Kohs (1923), Alexander 
(1936) and Drever-Collins (1936) versions. A new series of designs 
was prepared, and scoring was based on the numbers of blocks 
correctly placed within quite short time limits. This seldom takes 
more than twelve minutes to give and is highly discriminative from 
about 9-10 years up to superior (though not very superior) adult 
level. 

Although Spearman and his followers (El Koussy, 1936; Cattell, 
1943) regarded performance tests merely as rather inefficient 
measures of g, containing no separate practical factor, Alexander's 
and other experiments cited above show that this is not true of 
adolescents and adults. The only analysis carried out in the Forces 
indicated that Kohs-Misselbrook Blocks measures much the same 
abilities as Squares, that is, g + k or F. 

Numerous performance tests were devised for both African and 
Indian personnel. Though adequate validatory evidence was 
difficult to obtain, it appeared that the most successful tests were not 
of the conventional formboard, jigsaw or picture completion 
types (even though the pictures were of objects presumed familiar 
to the recruits) but adaptations of tests which worked well in this 
country, namely: 

(a) Continuation of patterns or series made of strips of coloured 
wood (analogous to the Abstraction test). 

(b) Reproduction of figures by sticking pegs into holes (closely 
parallel to Figure Construction). 

(c) Simplified Kohs Blocks. 

Orientation Tests 

A very different type of spatial test contains verbal problems 
such as: 

If when standing on your head you are facing East, in what 

direction is your right arm pointing—N., S., E. or W. ? 

Very promising results were obtained with these tests both 
among Army trainees assessed for orientation or sense of direction, 
and in predicting the navigation marks of officer cadets in the Royal 
Navy. They were never developed further partly because the 
inclusion of sufficient items for good reliability would make them very 
long, and partly because it is difficult to prevent testees drawing 
plans unless (as in the Terman-Merrill scale) they are tested 
individually. Factorial analysis of one such test indicated a high 
g-saturation, and a spatial component which only overlapped 
slightly with that of ordinary k-tests. 

Observation Tests 

Three tests borrowed from the U.S. Army Air Force were used 
in the selection of air crew. Obs-A shows eight large aerial 
photographs, on which various objects or small areas of the terrain are 
lettered. Below each are six small photographs, as it were cut out 
of the large one, and the testee has to match these with the 
appropriate lettered areas. In Obs-B small photographs have to be 
matched with appropriate sections of large coloured maps, which 
are sub-divided into a dozen lettered districts. Obs-C is an aircraft 
identification test. Each item shows a silhouette of an aeroplane in 
the left-hand margin and five other similar silhouettes with the 
aircraft pointing in various directions. One out of the five has to be 
chosen which is identical (apart from direction) with the initial 
silhouette. 

These tests provided a good illustration of the dangers of judging 
from face validity. A and B in particular look as though they would 
be valid. But, though they did correlate significantly with initial 
and basic training marks among navigators, the coefficients were 
disappointingly small. In a factorial analysis all three tests were 
found to embody g + a group factor, distinct from k, and this 
might, perhaps, be identified with the P or perceptual ability factor 
claimed by some American psychologists (Dvorak, 1947). More 
probably, however, this component is a result of training, some 
applicants for air crew having had more experience than others in 
reading maps and photographs and in aircraft identification. 

Mechanical Tests 

The tests chiefly used in the Forces were: 

S.P.2, Bennett Mechanical Comprehension and revisions thereof. 
Some forty to fifty pictures are shown of mechanisms (e.g. gear 
wheels), or mechanical situations in everyday life (e.g. trains going 
round bends, men lifting weights, etc.), and selective-response 
questions are asked, as in Fig. 9. 

A new Practical Problems test for the A.T.S. is based on pictures 
and questions about cooking, clothes, motor-cars and other things 
of which women might be expected to have had experience. 

Mec-B, Mechanical Comprehension (Diagrams).—Eight diagrams 
of machines are shown, and several selective-response questions 
are asked about the workings of each. This test has generally 
yielded better validities in the R.A.F. among pilots, navigators and 
flight engineers than has the pictorial Bennett test, Mec-A. 

S.P.117E. and M., Electrical and Mechanical Information.— 
Effective ten-minute tests were constructed containing twenty-five 
electrical and thirty mechanical items of the following 
simple-completion variety: 

The terminals in a metal electric lamp holder are usually 
mounted in ...... 

To get a keen edge on a chisel the final sharpening is done 
with ...... 

It is seldom possible to devise items where only one response is 
correct (these two were rejected because they produced so many 
alternatives). Hence scorers have to be provided with 
comprehensive lists of permissible answers. Nevertheless, they have decided 
advantages over selective-response tests. The R.A.F. has used both 
these and selective-response tests of mechanical information 
(Mec-C) and aviation + engineering information (Gen-D). The latter 
obtained some of the highest coefficients of any test among pilots 
and air gunners, and was also the best single test in the U.S.A.A.F. 

Correlations between comprehension and information tests are 
so high (approximately .7 in unselected groups) that they are 
clearly measuring much the same ability. Thus items of both types 
were included in the new naval recruiting centre tests. Questions 
involving specific trade experience were avoided in Test 117, and 
a separate series of Tests of Trade Knowledge was constructed, 
including verbal and pictorial questions which had been proved to 
differentiate between groups of men known to be skilled in that 
trade and an inexperienced group (cf. pp. 31-2, 45). 

S.P.8, Assembly Test.—A test of the Stenquist type, containing 
nine sets of parts to be assembled, was devised for the Army. Only 
parts which could readily be duplicated from Army stores were 
included. It was so arranged that one tester could test eight 
recruits simultaneously in about thirty-five minutes. 

The A.T.S. required an easier mechanical test and one was 
devised which involved stripping and reassembling seven Meccano 
models with the aid of full-sized pictures (S.P.24). In the Navy, 
a test of ability to bend a piece of wire with pliers into a given shape 
was used in several investigations (S.P.103), but it was not applied 
regularly because the scoring against a “quality scale” was some¬ 
what difficult. These tests, too, could be given to small groups. 
Cox’s or Vincent’s models tests of mechanical comprehension were 
applied in certain naval and Army investigations, and in the R.A.F. 

All the above mechanical tests and some of the spatial ones were 
shown to be of value in the Forces in the selection of apprentice 
tradesmen at 14-16 years, and to add considerably to predictions 
based on academic examinations and intelligence tests, thus 
confirming the results of Hunt and Smith (1946) and Holliday (1943). 
We have already described their disappointing validity among 
male adults and attributed it partly to poor reliability, and partly 
to the distorting effects of the varied mechanical experience 
possessed by recruits from non-mechanical civilian occupations as 
well as by tradesmen. Models tests appeared to be even less 
successful than paper-and-pencil or assembly ones. Since the 
specifically mechanical content of the latter was of little use, they 
acted as tests of all-round practical ability in much the same way as, 
or better than, spatial tests. Thus the assembly test obtained its 
highest validity coefficients (averaging .36 and ranging up to .68) 
when compared with tests of elementary training among infantry, 
assault course marks in R.A.C., and proficiency in demolitions, 
field work, bridging, etc., in the R.E. 

Given more paper and materials, and more staff, better tests 
might perhaps have been developed. An important principle 
established in later investigations was to choose test items which 
contributed most to mechanic selection over and above g and education, 
in order to cover different ground from that predicted, say, by 
Arithmetic and Instructions. But mechanical ability is itself so 
complex that the search for a test which might be useful for professional 
engineers, for radio technicians, for fitters, for garage hands and 
for machine operators, does not seem very hopeful. Thus Admiralty 
and War Office psychologists came to rely more and more on 
straightforward information tests. Tests like S.P.117E. and M. 
were often found to be more predictive than either Bennett or 
Arithmetic. Tests of trade knowledge were particularly useful in 
gauging the extensiveness of experience claimed by men who had 
already entered a civilian trade, and its relevance to Service jobs. 
While it is true that these are paper-and-pencil tests, they are much 
more reliable than the conventional practical trade test, and it was 
sometimes possible to show that they give better forecasts of 
success in passing mechanic or electrician training courses. 
Objections were sometimes raised that mechanics selected in this way 
might be lacking in the “practical knack,” but it is extremely 
dubious how far such an ability really exists. The evidence amassed 
by psychologists appears to show, rather, that men may possess 
widely different degrees of practical ability in different jobs. Hence 
the best basis for prediction consists of records of work actually 
done. This problem of the nature, and consistency, of mechanical 
ability will be discussed more fully elsewhere. 

Illustrative Experiments 

The following validatory experiments illustrate some of the 
points made above. 

Radar Operators .—In 1943 correlations were obtained between 
the standard naval tests and success at the training course among 
603 radar operators. They are shown in the first column of Table 
'XXXVII, corrected for selectivity. This is the only naval study in 
which the coefficient for Matrices surpassed that for T2, though it 
equalled it among visual signallers. It was shown, too, that Matrices 


Table XXXVII. The Value of Tests in Selecting Radar Operators

                                    Correlations in:
Test                                 1943      1946

0 Matrices                            .42       .30
1 Abstraction                         .34       .27
2 Bennett                             .28       .32
3a Arithmetic                         .26       .29
3b Mathematics                        .30       .36
4 Squares                             .20       .14
T2                                    .30       .36
Group Test 70 (all parts)                       .32
110 Scale and graph reading                     .42
97 Memory for designs                           .24
109 Figure construction                         .27
118 Oscilloscope reading                        .21


was most useful at the bottom of the scale for differentiating fails 
from passes, whereas Mathematics was more useful at the upper 
end. Failing the course depends chiefly on practical operating 
whereas high marks are determined to a greater extent by ability 
at theory. Note that Squares is the poorest test. Apparently the 
perceptual ability required among operators is not the same as k.

By 1946 the radar sets and the nature of the training had greatly
altered, and in a fresh experiment the standard battery along with
several new tests was given to 411 radar plot ratings. The (uncorrected)
correlations with practical marks are shown in the second
column of Table XXXVII. Although Mathematics is now relatively
more important, and ability to read scales and graphs gives much
the best predictions, Matrices still does quite well. Group Test 70
is similar. Two additional k-tests are better than Squares, but do
not add appreciably, and a U.S. Navy oscilloscope reading test,
based on discrimination of pictures of radar patterns, is of little
value.
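
The correction for selectivity applied to the 1943 coefficients is not
spelled out here. As a rough illustration only, the sketch below uses the
standard univariate range-restriction formula (often called Thorndike's
Case 2), which assumes that selection acted directly on the test in
question. The function name and the standard deviations are hypothetical
and are not taken from the naval data.

```python
import math


def correct_for_selectivity(r_selected, sd_unselected, sd_selected):
    """Estimate the correlation in an unselected group from the one
    observed in a selected (range-restricted) group.

    Assumption: selection was made directly on the test itself.
    """
    k = sd_unselected / sd_selected  # ratio of spread before and after selection
    return r_selected * k / math.sqrt(1.0 - r_selected**2 + (r_selected**2) * k**2)


# Hypothetical figures: an observed coefficient of .30 in a group whose
# spread of test scores was cut from 10 points to 7 by prior selection.
corrected = correct_for_selectivity(0.30, sd_unselected=10.0, sd_selected=7.0)
print(round(corrected, 2))
```

The corrected figure always exceeds the observed one when selection has
narrowed the spread of scores, which is why corrected and uncorrected
columns should not be compared directly.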

Army Apprentice Tradesmen. Approximately 1,000 boys who
took examinations and certain batteries of S.P. tests on entry at 
14 years were followed up over periods of one to three years, during 
which they had been trained as fitters (general, motor vehicle, or 
gun), armourers, electricians, instrument mechanics, carpenters, 
masons, or in other smaller trades. The courses were sufficiently 
similar, and the correlations in different trades sufficiently uniform 
to justify the quotation, in Table XXXVIII, of average figures. 

Table XXXVIII. The Value of Tests and Examinations in Selecting
Apprentice Tradesmen

Examination or Test       Correlation    Mechanical Test         Correlation

Arithmetic entry exam.        .16        2 Bennett                   .29
English entry exam.                      4 Squares                   .29
S.P.3A Arithmetic             .13        8 Assembly                  .27
S.P.3B Mathematics                       (Meccano Assembly           .22)
S.P.17 or 26 Verbal           .16        (Mechanical interests       .37)
(S.P.21 Instructions          .32)
Matrices                      .16
Oral directions               .13




While all coefficients are small, this was due largely to the unreliability
of the available criteria. When the criteria were divided into more and
less reliable sets, the multiple correlations with them were .48
and .30 respectively. The tests listed in parentheses were applied
to some 300 boys only, for whom the criteria were exceptionally
good, hence their coefficients are not comparable with the rest.
Bennett, Squares and Assembly are the most generally useful tests,
along with Mechanical Interests, a test on the same lines as
Strong's Vocational Interest Blank. Arithmetic is of some value,
but general intelligence, Directions, and verbal-educational tests
(with the possible exception of Instructions) have little to contribute.
The correlations for examinations are unduly low since
these examinations were actually used for selection, but correction
raises them only slightly.
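
The effect of an unreliable criterion on a validity coefficient can be
sketched with the classical correction for attenuation, applied here to
the criterion only. This is an illustration, not a reconstruction of the
authors' calculations: the criterion reliability of .50 is purely
hypothetical, and only the Bennett coefficient of .29 comes from
Table XXXVIII.

```python
import math


def correct_for_criterion_unreliability(r_observed, criterion_reliability):
    """Estimate what the correlation would be against a perfectly
    reliable criterion (correction for attenuation, criterion side only)."""
    return r_observed / math.sqrt(criterion_reliability)


bennett_observed = 0.29          # from Table XXXVIII
assumed_reliability = 0.50       # hypothetical value, for illustration only

bennett_corrected = correct_for_criterion_unreliability(
    bennett_observed, assumed_reliability
)
print(round(bennett_corrected, 2))
```

On these assumptions a coefficient that looks small against poor
training marks would be appreciably larger against a dependable
criterion, which is the point made in the text.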

A similar study with 860 naval artificer apprentices showed that
the following tests have the best validities:

S.P.2 Bennett, 117E. and M. information.

Medium validity coefficients were obtained by:

S.P.97 Memory for Designs, 4 Squares, 3b Mathematics,
7 Kohs Blocks, 110 Cox models and 103 Wirebending.

Very low validities were obtained by:

S.P.1 Abstraction, 3a Arithmetic and the entry examination.

Radio and Electrical Mechanics and Wiremen (L). Several information
tests were given to some 300 naval radio mechanic trainees,
214 electrical mechanics and about 100 wiremen (L), that is,
civilian electricians who had passed a practical trade test on entry
into the Navy. Correlations with training results are shown in
Table XXXIX. Test 84 is a test of trade electrical knowledge,
99 a test of knowledge of radio symbols, and 117E. of everyday or
amateur electrical information.


Table XXXIX. The Value of Tests in Selecting Electrical and
Radio Trainees

                                            Correlation among:
Test                                   Radio       Electrical   Wiremen
                                       Mechanics   Mechanics    (L)

T2                                       .29         .61          .41
2 Bennett                                .32         .24          .29
3b Mathematics                           .24         .64          .40
84 Electrical trade knowledge            .33         .33          .46
99 Radio symbols                         .36         .29          .42
117E Electrical information              .46         .46          .36
117M Mechanical information              .21         .30          .30
Initial trade test                                                .36
P.S.O.s' judgments of suitability        .33         .20           -


All three information tests surpass the standard battery among
radio mechanics. With electrical mechanics, mathematics is especially
important, but radio knowledge is unnecessary. Thus Tests 3b,
117E, 84 and 117M have the highest coefficients. Among wiremen
the trade test is less predictive than any of the information
tests, Mathematics or T2. Similarly unsatisfactory validities for
trade tests were found in other investigations of electrical artificers
(direct entry) and cinema projectionists.


















CHAPTER 


SPECIAL APTITUDE, AND TEMPERAMENT, TESTS 

Abstract. Descriptions are given of tests of physical agility, and
of auditory capacities including aptitude for morse; also of tests
involving apparatus for gunners, radar operators, aircraft pilots
and other special groups. The results obtained, particularly with
the simpler tests, were mostly disappointing, and it appeared that
more successful selection could be achieved with the aid of paper-and-pencil
tests and “work-sample” methods. Some progress was
made with the selection of recruits suitable as instructors, but very
little in picking those with good or poor personality and temperamental
qualities, such as leaders on the one hand, and emotionally
unstable or psychiatric suspects on the other hand. Recent developments
in personality testing are reviewed. Projection tests and
other indirect methods are valuable in skilled hands, and batteries
of objective tests for certain “dimensions” of personality such as
general neuroticism are feasible, though too elaborate for large-scale
use. Neurotic inventories or questionnaires gave good results
in the Forces, but would not necessarily work so well under civilian
conditions. However, these, together with group projection tests,
and interest questionnaires, deserve further investigation.

A number of specialised tests were tried out in the Forces and
a few were used regularly, though none was outstandingly successful.
Some of the results obtained with Chleusebairgue’s and other
motor driver tests, with auditory tests for asdic operators, and with
the physical agility test in the Army, have already been quoted.
The latter (S.P.16) was measured by the time taken to transfer a set
of steel rings from two upright posts to two others standard
distances apart, the mean time being approximately one minute.
The test was devised to resemble the duties of gun numbers and
infantry, and did yield significant but small correlations with proficiency
among the latter (averaging .22). But it was very unreliable,
being much affected by the floor surface and the recruits' shoes, by
their state of health at the time, and by the amount of enthusiasm
the tester was able to stimulate. Even this test was found to depend
to some extent on g. It had a certain propaganda value, but very
little else. Several dexterity tests were tried out in mechanical
branches of the Navy, Army and A.T.S., with no success. A finger
dexterity test (Co-ord.-C) which is being used in the R.A.F. has
been described by Cockett (1947).

Considerable research was devoted to tests for gunnery though 
it was hampered by the extreme unreliability of the available 
criteria. Instructors and officers could not say which recruits really 
were good or poor gunners, and objective proficiency tests were 
ruled out by lack of time and ammunition. Two miniature situation
tests (Hotoph’s layers test and Craik’s predictor) did give promising
results, but it was impracticable to construct and maintain the
apparatus needed for testing, say, 1,000 men a week. Among
A.T.S. recruits in anti-aircraft batteries, a set of nine sensory-motor
tests gave correlations ranging from +.185 to -.106 with
assessments of proficiency at a practice camp, whereas the Matrices
test correlated .323 with the same criterion, and Table XXX shows
that group paper-and-pencil tests yielded even better predictions
in later follow-up. In the Navy it was observed that marks awarded
by the gunnery officer during the short gunnery course at seaman
training establishments were as predictive of later success at a
gunnery school as was T2 (correlations of .47). Of forty-three men
receiving recommendations at their seaman establishment only one
(2.3 per cent.) failed, whereas of 162 not recommended, 19.7 per
cent. failed. This suggests that paper-and-pencil plus work-sample
testing constitutes a better approach to the selection of gunners
than aptitude testing.
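
The recommendation figures can be laid out as a two-by-two table and
summarised by a phi coefficient. The sketch below is illustrative only:
the count of 32 failures among the 162 men not recommended is inferred
from the quoted 19.7 per cent. and is not stated in the text, and the
function name is hypothetical.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table laid out as
           pass  fail
    row 1    a     b     (recommended)
    row 2    c     d     (not recommended)
    """
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


recommended_pass, recommended_fail = 42, 1        # 43 men, one failure
# Assumption: 19.7 per cent. of 162 is taken as about 32 failures.
not_recommended_pass, not_recommended_fail = 130, 32

rate_recommended = recommended_fail / (recommended_pass + recommended_fail)
rate_not_recommended = not_recommended_fail / (
    not_recommended_pass + not_recommended_fail
)
phi = phi_coefficient(
    recommended_pass, recommended_fail, not_recommended_pass, not_recommended_fail
)

print(f"failure rate, recommended:     {rate_recommended:.3f}")
print(f"failure rate, not recommended: {rate_not_recommended:.3f}")
print(f"phi coefficient:               {phi:.2f}")
```

A positive phi here simply expresses that recommended men failed less
often; its size is limited by the small proportion of men recommended.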

Chapter XVI shows that the same conclusion is probably true
of aircraft pilots. However, two complex co-ordination tests,
devised by the Cambridge University Psychology Department,
were found to possess some validity. S.M.A.3 (Co-ord.-A) requires
the testee, seated in a mock-up cockpit, to keep a moving spot of
light as near as possible to the centre of a screen, during three runs
of 1½ minutes each. He can adjust its vertical position by a control
column, and its horizontal position by a rudder bar, so that gross
co-ordination of both hands and feet is tested. The runs are
scored by the amount of time the light falls outside a central square.
The control of velocity test (Co-ord.-B) consists of a rotating
ivorine cylinder, marked out as a winding road with “kerbs” and
“obstacles” punched in it. The testee steers a bronze ball along
this road by a hand steering wheel. Each time he fails to avoid an
obstacle an electrical counter is operated. The wheel is so geared
to the ball that motor anticipation is essential, since rapid movements
at the last moment cause it to wobble violently. The U.S.
Army Air Force, with its much greater resources of personnel and
materials, was able to develop several other apparatus tests possessing
useful validity for pilot selection. Nevertheless, its paper-and-pencil
tests generally gave better predictions (cf. Flanagan, 1946;
Davis, 1947). Similar work was done by Biesheuvel for the South
African Air Force.

In yet another experiment on naval radar operators, two tests
were found to add appreciably to T2 in the prediction of training
marks. The Echo test, devised by Craik, provided a mechanical
representation of the “scan” of a commonly used radar set and
was scored by means of the smallness of the echo, or peak in the
tracing, which the testee could place correctly. This had to be
abandoned, however, since no method could be found for converting
it into a group test. The other test, Pointers, required testees
to gauge the positions on a scale to which two arrows were pointing.
Forty such items were shown on large cards at an increasing
rate, the object being to observe the testees’ reactions to stress, and
their tendency to break down or “flap.” Whether this really acted
as a test of temperament could not be determined, but it certainly
depended to a considerable extent on mathematical ability. Hence
the scale and graph reading test, mentioned above, was developed
instead.

Morse Aptitude and Auditory Tests 

The Morse Aptitude test (S.P.10) used for selecting Army and
A.T.S. signallers during most of the war consists of seventy-eight
pairs of sound patterns, presented by gramophone and headphones.
Testees judge whether the two patterns in each pair are
the same or different. This is an American test, the U.S. Signal
Corps Code Aptitude test, dating from 1918. Since its reliability
is inadequate, American psychologists sometimes trebled its
length, but later discarded it in favour of morse learning tests. The
U.S. Army “Speed of Response” test has now been adopted by
the Royal Navy, and was proved in an Army investigation to
possess superior validity. It, too, is given by gramophone records.
The testees first get instruction and practice on three actual morse
patterns for seventeen minutes, and are then tested for the quickness
and accuracy with which they can receive these patterns, for
about ten minutes. Naturally recruits with previous experience
have an advantage, hence it is supplemented with a short record for
testing receiving ability at four and six words a minute.
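
Why trebling the length of a test should help its reliability can be
sketched with the Spearman-Brown prophecy formula. The starting
reliability of .50 below is hypothetical, since no reliability figure
for the Code Aptitude test is quoted here, and the formula assumes that
the added items behave like the original ones.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`
    using items comparable to the existing ones."""
    return length_factor * reliability / (1.0 + (length_factor - 1.0) * reliability)


original = 0.50                      # hypothetical reliability of the short test
trebled = spearman_brown(original, 3)
print(trebled)  # 0.75
```

On this assumption a test of mediocre reliability becomes serviceable
when trebled, though at the cost of three times the testing time.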

The old Morse Aptitude test occasionally gave correlations of .30
to .36 with receiving proficiency after training, but in extended
follow-up its validity was lower and could be attributed largely to
its dependence on intelligence. Correlations of Arithmetic, Clerical
and Spelling tests with proficiency were usually superior. It was
particularly defective in that its correlation was better at the top
than at the bottom of the scale, i.e. it was fairly efficient in differentiating
potentially very good from average telegraphists, but
almost useless at picking out recruits who could never learn morse.
Even the new morse learning test, though claimed to have a
validity coefficient of .49 in the American Forces, depends to a still
greater extent on intelligence and does not appear to give good
predictions beyond the early stages of training. Moreover, it was
found in the Navy and R.A.F. that speed and accuracy of receiving
fluctuate erratically during training, and even from day to day.
For example, accuracy at the sixth, tenth, fifteenth and twentieth
weeks gave the following correlations with final accuracy at the
twenty-fifth week (20 words/min. rate): .61, .61, .06 and .71.
Several trainees with very poor accuracy of under 60 per cent. at
seven weeks (8 words/min.) eventually passed with 90 per cent. or
over*. Still greater alterations occurred among telegraphist air
gunners whose training lasted forty-two weeks, and among Army
boy tradesmen. Thus it hardly seems likely that any test lasting
only half an hour can be expected to give good predictions of
eventual efficiency.

* The reliability is doubtless lowered by the inadequacy of the methods of
testing proficiency. However, in R.A.F. investigations the American objective
Code Receiving tests were employed, and highly significant individual
variability was still found.

A group auditory acuity test (S.P.19) was applied for a period in
the Army and A.T.S. in the selection of signallers and certain
categories of operators. Here a series of numbers is dictated by
gramophone and headphones, with decreasing intensity. The
reliability of the test was hopelessly inadequate, and in only one
group (A.T.S. special operators) was it ever shown to have any
validity (a corrected correlation of only .24). Since this test has
been adopted by several Education Authorities for identifying
hard-of-hearing children, it is to be hoped that careful investigation
of its value will be made in schools before any reliance is
placed on its results. Two gramophone tests developed in the
R.A.F. are briefly described by Dickson (“Brit. Med. Bull.”, 1947).
One involves counting the numbers of pure-tone pips sounded at
levels of intensity; the other requires recognition of
words heard through a background of engine noise.

Selection of Instructors 

Investigations were made in all the Services into the selection
and training of instructors, those in the R.A.F. being the most
extensive. Youthfulness, intelligence and education were found to
be of considerable importance. Thus the Mathematics test 3b
correlated .46 with assessments at the end of training among 161
naval gunnery instructors; and among driving instructresses in the
A.T.S., instructions, spelling, the “Mec” test and previous education
yielded good results. But a particularly effective miniature
situation test was developed in the latter research. This consisted
in giving the candidate a Meccano model to study, and then having
her explain to a “stooge” how to make the model, given only the
separate Meccano parts. Ratings by a psychologist who listened to
the explanations of 146 candidates were considerably more predictive
than ratings, based on interview only, by a P.S.O., a
psychiatrist, and a mechanical transport officer. Further work confirmed
the value of this method and demonstrated that it could be
applied equally successfully by specially trained P.S.O.s. But
other, more objective, tests which were tried out as alternatives to
this subjective rating showed no promise.

Personality and Temperament Tests

In Chapter IV we described the importance attached to personality
qualities among naval and Army officer candidates,
together with the interview and group task methods developed for
assessing them. It would have been extremely useful to possess
reliable tests which could be applied to all recruits, both for screening
cases with suspected neurotic or psychopathic tendencies (who
could be sent for interview by psychiatrists), and for gauging
leadership, industriousness, and the like, which were clearly of the
greatest significance in all naval and military employments. Even
if the worth of the W.O.S.B. techniques had been more fully
established, it would have been impracticable to extend them to
very large numbers. Hence the assessment of personality was based
almost entirely on P.S.O.s’ interviews, whose inadequacy we have
already admitted. The difficulties of personality testing have often
been described, and these are enhanced when numbers are so large
that the tests must be group ones, or (if individual) very short,
and when a considerable proportion of the testees are barely
literate, and of the testers only moderately skilled. Little time could
be devoted to the problem by qualified psychologists; thus, we
cannot claim to have advanced very far.

Actually a greater amount of relevant work was carried out by 
Eysenck (1947a) and his collaborators at the Mill Hill Emergency 
Mental Hospital, and some of his main methods and results will be 
summarised first. By means of factorial analysis, experiments and 
tests, chiefly with neurotic Army and A.T.S. patients, he was able 
to confirm the existence of two major dimensions or factors of
personality, comparable to g and v:ed v. k:m in the field of
abilities. These are:

(1) General neuroticism, or stable and integrated v. maladjusted
and poorly-organised personality.

(2) Extravert v. introvert tendency, the extremes of which are
represented by hysteria and dysthymia (anxiety or obsessional
neurosis) respectively.

A number of tests were found to differentiate to some extent
between hysterics and dysthymics, or between maladjusted and
normals. The latter included:

(i) A personality questionnaire containing such questions as:

Have you ever been off work through sickness a good
deal? Yes. No.

Did you find it difficult to make friends? Yes. No.

This was presented as a “medical” questionnaire, in order to make
the hypochondriacal questions more acceptable. But it also contains
items bearing on inferiority feelings and lack of sociability,
i.e. characteristics sometimes regarded as introverted. Eysenck
finds that these characteristics are associated with neuroticism (in
his sense of the term) rather than with dysthymia, and the questionnaire
studies surveyed by Vernon (1938a) support him. The
mean scores on forty questions among 300 normals and over 600
neurotics were 2.6 and 10.6 respectively.

(ii) Body Sway. (a) Static Ataxia. The testee stands upright
with his eyes closed, and a thread is attached to the back
of his collar, and connects with a pointer which records the
amount of sway on a smoked drum. Among 120 normals,
none swayed more than 2 inches in 30 seconds, whereas
34½ per cent. of 900 neurotics swayed 2 to 6 inches or more.

(b) Suggestibility. A gramophone record is now started 
which reiterates the suggestion: “You are falling, falling 
forward. . . .” Sway is generally increased and some 
60-70 per cent. of neurotic patients now move more than
2 inches, but only 6-10 per cent. of normals do so.

(iii) Dark Vision. The Livingston rotating hexagon test, used 
in the R.A.F., showed that neurotics tend to have much 
poorer dark adaptation than normal recruits. 

(iv) Rorschach Inkblots (Harrower’s multiple-choice group
test). The ten inkblots are presented by slides, each with
nine possible responses, some commonly chosen by neurotics,
some by normals. The testees rank these in order of
appropriateness, and the sum of their rankings of the
neurotic responses constitutes their score.

(v) Word Association (adapted from Malamud, 1946). Two
possible associations are supplied for each of fifty words,
one characteristic of normals, one of neurotics. The score
is the number of neurotic responses preferred.

(vi) Perseveration Tests. It was confirmed that normals tend
to obtain moderate scores and neurotics very high or very
low scores on ordinary motor perseveration tests.

(vii) Persistence. In one simple test (similar to that of Fernald,
1912), the testee sits on a chair and keeps the heel of one
shoe about an inch above the seat of a second chair as long
as possible. The average times for hysterics and dysthymics
were fourteen and thirty-one seconds, and for normals
over one minute.

(viii) Personal tempo or normal speed of work on such tasks as
manual dexterity or co-ordination tests was found to be
slower among neurotics. Good results were obtained with
the O’Connor Tweezer Dexterity test and the Track
Tracer, where the subject traces a path with a metal
stylus between rows of holes on an ivorine sheet, and a
buzzer sounds each time a hole is touched.

Eysenck has advanced some way towards the production of
reliable batteries of tests for measuring neuroticism and hysteria-dysthymia,
but not all the tests mentioned would be suitable for
large-scale application, and evidence regarding the validity of some
of them is as yet hardly conclusive.
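
How far a single test of this kind could serve for screening depends on
the proportion of neurotics in the group tested as well as on the test
itself. The sketch below takes the body sway suggestibility figures
quoted above (60-70 per cent. of neurotics and 6-10 per cent. of normals
moving more than 2 inches) and converts them into likelihood ratios. The
5 per cent. prevalence is a hypothetical figure chosen for illustration,
and the function names are not from any published procedure.

```python
def positive_likelihood_ratio(hit_rate, false_alarm_rate):
    """How much more often the sign occurs among neurotics than normals."""
    return hit_rate / false_alarm_rate


def probability_after_positive(prior, likelihood_ratio):
    """Bayes' rule in odds form: chance that a man showing the sign
    belongs to the neurotic group."""
    odds = prior / (1.0 - prior) * likelihood_ratio
    return odds / (1.0 + odds)


# Least and most favourable readings of the quoted ranges.
lr_low = positive_likelihood_ratio(0.60, 0.10)
lr_high = positive_likelihood_ratio(0.70, 0.06)

assumed_prevalence = 0.05  # hypothetical proportion of neurotics among recruits
p_low = probability_after_positive(assumed_prevalence, lr_low)
p_high = probability_after_positive(assumed_prevalence, lr_high)

print(f"likelihood ratio: {lr_low:.1f} to {lr_high:.1f}")
print(f"probability neurotic if sway exceeds 2 in.: {p_low:.2f} to {p_high:.2f}")
```

Under this assumed prevalence most of those picked out would still be
normal, which is one reason a battery rather than a single test is
needed.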

Indirect Tests 

One of the difficulties of personality testing is that, when subjects 
realise the object of the test (or if they misinterpret the object) they 
will always tend to respond so as to put themselves in the most
favourable light. Hence for many years both Burt and the present
writer have advocated a more indirect, qualitative approach, where
the subject reveals his personality or temperament by the manner
in which he tackles various tasks, usually tests of abilities. This
was the principle adopted by Biesheuvel in his work with the South
African Air Force, not yet published. He was able to show that
carefully trained testers could make consistent judgments on specially
prepared rating scales, of qualities revealed by the subjects’
methods of approach, reactions to difficulties, etc., and that such 
judgments had useful predictive value in the selection of air crew. 
Moreover, several aspects of performance at sensory-motor or 
other practical tasks, which could be objectively measured, were 
found to be temperamentally significant. When, however, similar 
methods were tried out in the selection of U.S.A.A.F. pilots, the 
results were generally disappointing (cf. Davis, 1947; Guilford, 
1948). 

Another approach to the investigation of temperament was
adopted by the Cambridge Unit of Applied Psychology. It was
considered that temperamental stability or tendency to “flap” may
be shown by the trend of performance in tests which impose prolonged
stress on the subject. One of these was the Pointers test
(p. 249), another the Track Tracer, applied at maximum speed for
several minutes. A third was a modification of the McDougall-Schuster
dotting machine, where the holes which the subject tries
to hit with a stylus revolve past him at a constant rapid rate. The
performances of the subjects on these tests in successive half-minute
or other periods are recorded, and from the upward or
downward trend, or the irregularity, a somewhat subjective rating
of temperament is reached. Adequate evidence as to the validity
of these methods is not yet published, and they would hardly
appear suitable for large-scale use, both because they involve a
fairly lengthy session of individual testing, and because they could
scarcely be entrusted to semi-trained testers. For the same reasons
no attempt was made in the Forces to apply projection tests except
to relatively small groups such as officer candidates, although the
original Rorschach inkblots, thematic apperception, and other projection
tests have come to be regarded, in recent years, as the most
promising techniques for the diagnosis of personality. It was noted
too, that tests of the projection type tend to be unpopular both
among testees and employers (e.g. Army officers) because of their
apparent lack of relevance to the job. The psychologist cannot
afford to disregard this factor of “face validity” completely.

Another test providing measurable indications of such traits as
calmness, foresight, initiative and persistence is Cattell’s (1941,
1944) C.M.S. or cursive miniature situation test, which is an elaboration
of the dotting test. He gives striking evidence of differentiation
between delinquents or other unstable personalities and
normals. Though it failed to live up to its promise in a small trial
with W.O.S.B. candidates, this may have been due to our inability
to construct suitable apparatus and dotting sheets. It suffers,
however, from a serious drawback, namely, that the scoring is
extremely tedious and time-consuming. Only if this could be
simplified or automatic electric scoring introduced, would it
become a practicable test. As mentioned above, it is possible that
the Clerical or Instructions test is another indirect measure of
certain temperamental qualities, comparable to Biesheuvel’s, the
Cambridge, and Cattell’s tests.


Tests Tried Out in the Army 

As already indicated, no precise data are available on the value
of the projection tests used at W.O.S.B.s. An early experiment,
however, revealed significant differences between the grammatical
forms of response to the group word association test among fifty
company commanders, fifty W.O.S.B. passes and fifty fails (of
equivalent intelligence level). For example, the mean percentages
of complete sentence responses were 41, 31 and 19 per cent.
respectively. No simple method of scoring either the form or content
of responses could be devised, which could be used by unskilled
testers, except the number of blanks or failures to respond.
This was tried out at an Army selection centre on 218 recruits, who
were interviewed by psychiatrists and assessed for temperamental
stability v. neurotic tendencies on the Culpin scale. No correlation
whatever was found, and failures to respond appeared to depend
chiefly on illiteracy and on the testing conditions (e.g. the particular
tester). By contrast a “medical” questionnaire, similar to
Eysenck’s but based on Part III of the Cornell Selectee Index
(Weider, 1944), gave an agreement represented by a tetrachoric
correlation of .31 with the Culpin scale ratings, which is moderately
promising.

The group Rorschach (ranking) test was given to 200 A.T.S. in a
selection centre, fifty of whom had been picked out as having
personality problems by the P.S.O. and psychiatrists, 100 as intermediate,
and fifty as being the most stable and well adjusted in the
group. The median scores were 215, 223 and 234 respectively,
corresponding to a correlation ratio of .26 (note that a low score
indicates instability). If a borderline mark is fixed at 200, it would
cut off 26 per cent. of the most neurotic and 12 per cent. of the
remainder. Such differentiation is far from adequate, but it should
be pointed out that the testees were a selected group. In general
recruits are not sent to a selection centre unless they are in some
degree maladjusted, yet, at the same time, few are likely to be in a
serious neurotic state. Thus this result does not conflict with that
of Eysenck who obtained median scores of 206 and 231 among
neurotics and normals. The test was found to depend to some
extent on g and v:ed (giving correlations of .33 with Instructions
and Spelling). However, with the revision of the responses presented
to the testees, it might in combination with other tests be
useful for screening purposes. A new inkblots test, suitable for 
group application, was prepared by Harvey, but has not yet been 
validated. 
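
What a borderline mark of 200 would mean in practice can be worked out
from the figures given for the 200 A.T.S. recruits: fifty in the most
neurotic group and 150 in the remainder (100 intermediate plus fifty
stable). The sketch below is simple arithmetic on those quoted
percentages; the variable and function names are illustrative only.

```python
def flagged_counts(group_sizes, flag_rates):
    """Number in each group falling below the borderline mark."""
    return {
        name: round(group_sizes[name] * flag_rates[name])
        for name in group_sizes
    }


group_sizes = {"most_neurotic": 50, "remainder": 150}
flag_rates = {"most_neurotic": 0.26, "remainder": 0.12}  # scoring below 200

flagged = flagged_counts(group_sizes, flag_rates)
total_flagged = sum(flagged.values())
share_neurotic = flagged["most_neurotic"] / total_flagged

print(flagged)
print(f"flagged in all: {total_flagged}")
print(f"share of flagged who were in the most neurotic group: {share_neurotic:.2f}")
```

More of those cut off would thus come from the remainder than from the
most neurotic group, which bears out the remark that the
differentiation is far from adequate.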

It may seem curious that neurotics should differ from normals 
on sensory and motor tests such as dark vision and dexterity, 
though not on tests of mental abilities. But Slater (1944) claims 
that temperamental inferiority would be expected to be associated 
with constitutional physical inferiority, and has shown that significant
differences occur on other tests such as Agility (S.P.16) and
visual acuity. For example, a simple acuity test was given to
109 neurotic patients and to 2,233 normals (none of whom wore
glasses), and only 30 per cent. of the neurotic group reached or
exceeded the normal median score.

Some experiments were carried out on gradings of recruits by
their own fellows, which appeared to show that they were capable
of nominating promising candidates for commissions, who had not
been picked up by the usual channels. Unfortunately, this technique
was received with some suspicion. Subsequent American
work has shown that gradings by fellow cadets at officer training
schools give quite good predictions of success in battle a year or
two later (Baier, 1947). The correlations of .42 and .61 are distinctly
more promising than correlations obtained for W.O.S.B.
or for O.C.T.U. grades. This suggests that it might be worth
taking fellow employees’ opinions into consideration in managerial
and other civilian selection.

Personality Inventories and “Medical” Questionnaires 

Evidence regarding the validity of these questionnaires in 
civilian practice is decidedly unfavourable (cf. Vernon, 1938a; 
Ellis, 1946), and one would naturally expect them to be more 
unsuitable with low-grade, semi-illiterate recruits. Moreover, they 
tend to arouse suspicion among naval and military officers. Nevertheless,
they achieved surprisingly good results both in America
and in this country, especially when given with medical backing.
Conrad (1947) suggests that this was due partly to the greater
heterogeneity of recruits than of most civilian samples, and partly
to the fact that the questions asked are very similar to those which
psychiatrists ask when they are assessing the neurotic tendencies of
the testees. Possibly, too, average and sub-average men and women 
accept the questions more readily than sophisticated university 
students, on whom much of the earlier work was performed. Yet 
another factor is that civilians usually try to make themselves out 
as normal as possible, whereas recruits may adopt the opposite 
attitude. 

Two main types of validatory criteria have been employed, 
neither of them very satisfactory: 

(i) Assessments of neurotic tendencies in a miscellaneous group 
by psychiatrists. 

(ii) Differentiation between persons who have already developed 
neurotic breakdown and others who are presumed to be 
normal. Naturally, it does not follow that the neurotics 
would have given the same answers before they reached the 
stage of hospitalisation or discharge.

Nevertheless, one American study demonstrated the value of the
National Defence Research Council’s personality inventory in
differentiating recruits who, in the following year, were discharged
or committed offences, or were promoted. A long-term psychiatric 
follow-up of the Bennett-Slater questionnaire and the Harvey 
designs (inkblot) test has been undertaken in the British Army. 

While adaptations of the Cornell index and the N.D.R.C.
inventory (short format) were “got across” to recruits in this
country quite effectively, the most successful of such tests was that
published by Bennett and Slater (1946), where the neurotic
answers are ingeniously concealed. This is in ten sections, most of
which gave significant differences between neurotic and surgical
patients at Sutton E.M.S. Hospital. The first three consist of
ordinary questions, in about half of which a positive answer
indicates neurotic tendencies, in the remainder a negative answer.
Thus the testee who wishes to misrepresent himself cannot merely
answer “no” throughout. The sections deal with symptoms of
anxiety, hysteria and depression, respectively. The middle one was
the least satisfactory. The next four consist of lists of
“annoyances,” testees checking each item they find annoying. These are
classified as follows:

(1) Frustration of self-assertion, e.g. “Somebody tells you how 
to do your job.” 

(2) Personal inadequacy, e.g. “You forget what you’re looking
for.”

(3) Dirt or untidiness, e.g. “An unmade bed.”

(4) Noise, e.g. “The sound of hammering.”

Nos. (1) and (3) are regarded as things which might annoy any
normal person, but Nos. (2) and (4) are much more likely to affect
neurotics. All types are mixed in the test blank, and the score is
based on the difference between them.

The last three sections are revisions of Pressey’s cross-out test,
where the subject crosses out items:

(1) For which people should be blamed, e.g. “Flirting, speeding.”

(2) Which they have worried about, e.g. “Loneliness, falling.”

(3) In which they are interested, e.g. “Football, comedians.”

In the first two sets items are chosen as likely to affect neurotics,
and in the third as likely to appeal more to non-neurotics.

Slater shows that when the sections are appropriately weighted 
a certain borderline score will cut off 61 per cent. of neurotics and
only 2 per cent. of normals. This needs, of course, to be confirmed
on another, larger, group. A defect of the test is that it takes nearly 
an hour to give. Moreover, the scoring is so lengthy that it could 
never be used as a routine test. However, some of the sections and 
some of the items are more diagnostic than others, and a more 
practicable revision might well be constructed. American experi¬ 
ence suggests that a short twenty to thirty item test is as effective 
as a more elaborate instrument*. 

Another questionnaire, tried out successfully on motor drivers
(cf. p. 230), contained fifty miscellaneous questions such as:

1. Can you ride a bicycle?

2. Do you think all cars should be restricted to 30 m.p.h.?

3. Do you always take a torch with you in the blackout?

4. Do noise or vibration in a bus or train ever give you a
headache?

5. Do you prefer indoor to outdoor games?

All of these differentiated good from bad drivers, perhaps for a
variety of reasons. No. 1 may indicate “machine sense.” “Yes”
answers to Nos. 2 and 3 suggest undesirable nervousness, while
Nos. 4 and 5 are of the conventional psychosomatic or neurotic
type. Such questions are less embarrassing than those in most
personality inventories, and their object is not at all obvious, some
of the “good” answers being “Yes,” some “No.” Though the
scoring is relatively easy, it was the time taken over this and over
giving the test which prevented it from being more widely used.

* Cf. Stuit (1947). This book contains an excellent discussion of the
construction and uses of personality inventories.

Interest Blanks

Strong’s Vocational Interest Blank and Kuder’s Preference
Record have been shown, in America, to give useful predictions of
suitability for a number of professional and commercial careers.
Probably they are much less effective among average and
low-grade adults, for at these levels the responses to interest items
depend to so large an extent on intelligence and education, or else
on temporary fashions. For instance, the interests ticked by a
skilled instrument mechanic might resemble more closely those of
an equally intelligent clerk than they would those of a lower-grade
metal worker, say a welder. Nevertheless, a test consisting of a list
of jobs and leisure-time interests was standardised in the Army on
recruits representative of the various training recommendations:
drivers, mechanics, clerks, etc. Thus any new recruit’s interests
could be scored for their resemblance to those of men in each of 
these types of employment. Though the scores had poor reliability, 
they showed useful predictive value in several experiments, and the 
test was abandoned mainly because of the time required for 
scoring. A test of this kind might have been useful with a higher- 
grade population such as officer candidates, but preliminary 
studies with passed and failed candidates in the Navy and Army 
did not reveal many consistent interest differences. The same find¬ 
ing was reached in work with U.S.A.A.F. pilots, and it was con¬ 
cluded that aviation and other interests are better assessed by 
information tests (Guilford, 1948). Probably, as Wilson (1945) 
suggests, the type of interest is of less importance than the serious¬ 
ness with which it is pursued, and it is difficult for a pencil and 
paper test to elicit this attribute for a large number of interests. 
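
The idea of scoring a recruit's ticked interests for their resemblance to those of men already in each employment can be sketched as follows. This is only an illustration: the item names, the proportions and the scoring rule (proportion of the trade ticking an item, less the proportion of all recruits ticking it) are assumptions made for the example and are not taken from the Army test.

```python
def resemblance_scores(ticked, trade_profiles, overall):
    """Score a set of ticked interests against each trade's profile.

    ticked         -- set of items the recruit ticked
    trade_profiles -- {trade: {item: proportion of men in that trade ticking it}}
    overall        -- {item: proportion of all recruits ticking it}

    An item counts towards a trade when men in that trade tick it more
    often than recruits in general (hypothetical scoring rule).
    """
    scores = {}
    for trade, profile in trade_profiles.items():
        total = sum(profile[item] - overall[item] for item in ticked)
        scores[trade] = round(total, 3)
    return scores


# Hypothetical proportions, for illustration only.
overall = {"motor cycling": 0.30, "book-keeping": 0.20, "radio building": 0.25}
trade_profiles = {
    "driver": {"motor cycling": 0.60, "book-keeping": 0.10, "radio building": 0.20},
    "clerk": {"motor cycling": 0.15, "book-keeping": 0.55, "radio building": 0.15},
}

recruit = {"motor cycling", "radio building"}
scores = resemblance_scores(recruit, trade_profiles, overall)
best = max(scores, key=scores.get)
print(scores, best)
```

With a key of this kind every item must be looked up once for every trade, which gives some idea of why scoring time told against the test.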

Conclusion 

Our conclusions regarding the value of specialised aptitude tests
in peace-time have already been indicated in Chapters X and XII.
How about personality tests? Clearly the prospects for large-scale
application are not very bright. When ample time and skilled
testers are available, and the number of testees is small, objective
testing of general neuroticism and other “dimensions” is possible
by batteries such as Eysenck’s. Trained psychologists, too, can
make valuable personality diagnoses by projection tests and
by observation of “manner” of performance, etc. For routine use
by less qualified personnel, it seems that questionnaires could be
constructed along the lines of Bennett-Slater’s and the motor
drivers’, though they would need to be re-validated under the very
different conditions of motivation present among applicants for
civilian employments. Multiple-choice projection tests also
certainly deserve further exploration. Apart from these the most
hopeful line for assessment of personality would appear to be the
development of better interviewing techniques, which could be
partially standardised among different interviewers.



CHAPTER XVI


MAIN R.A.F. SELECTION FINDINGS 

Abstract. This Chapter gives detailed information on the
following topics:

1. Pilot training wastage before the introduction of grading. 

2. Relation between speed to solo and training achievement. 

3. Quarterly pilot training wastage from 1940 to 1943. 

4. Relation between grading score levels and elementary flying
pass rates.

5. Layout of the original grading score card.

6. Use of the raw grading scale by a number of different schools. 

7. Relation of grading performance to subsequent accident rate 
in pilot training. 

8. Relation of grading to age differences and failure rate.

9. Composition of the original (April, 1944) air crew aptitude 
test battery. 

10. Reliability of tests in the aptitude test battery. 

11. Validation of aptitude test category batteries against different 
training stages. 

12. Interpretation of aptitude test results in relation to expressed 
preferences and Service requirements. 

13. Personality and character traits deemed important in air crews. 

14. Reliability of the above trait assessments. 

15. Average intelligence test scores for selected R.A.F. ground
trades. 

16. Multiple correlations between G.V.K. scores and training 
results for a variety of R.A.F. and W.A.A.F. trades. 

The purpose of this Chapter is to present the findings on R.A.F. 
selection methods that are most likely to be of general interest. So 
far as possible the Tables will be left to speak for themselves, but
a certain amount of verbal clarification will usually be necessary. 

Air Crew Selection

The chart (p. 263) shows what had happened after two years to air
crew volunteers who had been accepted for pilot training by Aviation
Candidates Selection Boards. Out of every 100 cadets entering
training there were 41 who had qualified for operations, 17 who had
qualified but were engaged on non-operational duties (primarily as 
instructors), 6 who were still under training, and 36 who had 
failed to qualify as pilots. It is clear that wastage was heavily con¬ 
centrated at the elementary flying (E.F.T.S.) stage which was being 
used as the main means of pilot selection. 


Table XL. Relation Between Speed to Solo and Training Achievement
Total cases: 9 E.F.T.S.s, 30 courses, N = 2,203
Did not solo = 564 (24%). Did solo = 1,738 (76%)

                                 Sample                  Flying Ability
          Hours                  selected         E.F.T.S.             S.F.T.S.
          Dual to    Total       for           Supr.  Med.  Inf.    Supr.  Med.  Inf.
Group     Solo       Cases       follow-up       %     %     %        %     %     %

V. fast   5-8        103         160            52    46     2       44    48     8
Fast      9-10       010         160            34    58     8       41    48    11
Medium    11-14      786         160            20    00    14       29    50    21
Slow      15-21      160         160             0    60    35       14    54    32


The selection method known as Grading presumes a close 
correspondence between learning speed and subsequent achieve¬ 
ment and before it was introduced it was necessary to prove the 
reality of this association. Table XL which is based on the E.F.T.S. 
records of some 2,300 cadets during the summer of 1941 shows 
two things: 

(i) That almost a quarter of the cadets have to be suspended
without going solo at all.

(ii) That if those who do go solo are divided into four groups
based on the number of hours required to reach this stage,
consistent superiority is found throughout training by the
very fast over the fast, the fast over the medium and the
medium over the slow.*

The first of the above findings is in line with the conclusions of
nearly all other Air Forces, that of every four volunteers who are
medically fit and who appear in every way suitable for pilot
training, one will, in fact, be discarded in war-time without even going
solo.

* The table yields tetrachoric correlations of .39 and .26 between time to
solo and E.F.T.S. and S.F.T.S. results respectively. The first figure would be
higher if all pilot applicants went to E.F.T.S. and the second much higher if
they all passed on to S.F.T.S.

Fig. 12. Flight Test (R.A.F.) E.F.T.S.

Fig. 13. Flight Test (R.C.A.F.) Special Grading Experiment
(number of cases, 272; passed and failed at E.F.T.S.).

Up to and including the first quarter of 1942 flying training
schools were dealing with cadets who had been nominated as pilot
trainees by selection boards using the “impression” method. This
gave a gross pre-O.T.U. failure rate of about 40 per cent. when
R.A.F. schools were in the U.K., rising considerably higher when
they went overseas. In the second quarter of 1942 many of the 
cadets had received a limited amount of casual flying instruction 
before going overseas; the failure rate then became about 36 per 
cent. In the third quarter all cadets had been through the grading 
procedure, but draft commitments made it necessary to send 
nearly all overseas whether they had shown much or little aptitude 
for flying; under these conditions the subsequent failure rate came
down to 30 per cent. By the fourth quarter of 1942, grading came 
into full operation, only those who had demonstrated a relatively 
high degree of flying aptitude going forward. This method of
“quality selection” reduced the overseas failure rate abruptly to
15 per cent. The subsequent increased wastage rates were
the result of an attempt to secure higher quality by failing
borderline cases who would hitherto have been passed. This policy was
not successful and counter-measures had later to be taken to
restore the original standards.

Figs. 12 and 13 show the relationship between degrees of
grading proficiency and E.F.T.S. performance. Ordinarily no one
in the lowest three grading groups would be sent forward and
comparatively few from No. 4. The results in Fig. 12 are based on
nearly 15,500 pilots and show a steady diminution in the E.F.T.S.
failure rate as the higher grading groups are reached. Fig. 13
records similar data for the only complete grading population sent
forward; the numbers are small (272 in all), but the increase in
failure of Groups 1 and 2 is large enough to command attention.

R.A.F. Form 326(1)

GRADING ASSESSMENT FORM

No. .... E.F.T.S.                        Test No. 1
Instructors: ....
Rank .... Name ....
Dates grading ....
Solo Before Test / Solo Immediately After Test / Not Solo

For Office Use: Total Marks .... Extended by .... Checked by ....
Time to 1st Solo .... Time to Test ....

Each item is marked on the scale 0 1 2 3 4 - 6 7 8 9 10. The figure
in brackets after an item is its weighting.

1  Taxying and Tarmac Check
   a  Tarmac check
   b  Use of controls and eng.
   c  Safety-speed and look-out

2  Take Off and Climb
   a  Pre take-off airmanship
   b  Use of elevator
   c  Keeping straight
   d  Use of engine
   e  A.S. & R.P.M. on climb

3  First Approach
   a  Position to commence glide
   b  Selection of turning point
   c  Maintaining correct speed
   d  Adjustment of glide path

4  First Landing
   a  Selection of landing path
   b  Judgment of check height
   c  Handling during hold off
   d  Quality of landing attempt
   e  Subsequent actions

5  Medium Turn (Right)
   a  Going in and looking round
   b  Accuracy of turn to bank
   c  Maintaining constant bank
   d  Coming out w’out skid or slip

6  Medium Turn (Left)
   a  Going in and looking round
   b  Accuracy of turn to bank
   c  Maintaining constant bank
   d  Coming out

7  Steep Turn With Engine
   a  Going in and looking round
   b  Use of throttle
   c  Accuracy of turn to bank
   d  Keeping height
   e  Coming out

8  Medium Gliding Turn (R)
   a  Increase speed before turn
   b  Accuracy of turn to bank
   c  Maintaining correct speed
   d  Coming out at correct speed

9  Medium Gliding Turn (L)
   a  Increase speed before turn
   b  Accuracy of turn to bank
   c  Maintaining correct speed
   d  Coming out at correct speed

10 Second Approach
   a  Position to commence glide
   b  Selection of turning point
   c  Maintaining correct speed
   d  Adjustment of glide path

11 Second Landing
   a  Selection of landing path     (2)
   b  Judgment of check height      (2)
   c  Handling during hold off      (3)
   d  Quality of landing attempt    (6)
   e  Subsequent actions            (2)

12 General
   a  Alertness                     (3)
   b  Handling and control
      co-ordination                 (2)
   c  Style

Total Marks ....


The constitution of the flight test is most easily conveyed by a
study of the scoring card, the original version of which is shown
here. From this it will be seen that the test included twelve sections,
the majority of which correspond to specific flight manoeuvres.
From three to five features are assessed for each manoeuvre, every
feature being marked on a 10-point scale. Differential weightings
(1-6) are accorded to each detail, a hundred weighting units in all
being distributed among fifty items.
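
The arithmetic of the score card can be sketched as follows: each mark is multiplied by the weighting of its item and the products are added, so that a hundred weighting units and a top mark of 10 give a maximum of 1,000. The marks in the example are invented, and the set of permitted marks (0 to 10 with no 5) is taken from the printed scale on the form; the function names are hypothetical.

```python
# Marks printed on the form: 0 1 2 3 4 - 6 7 8 9 10 (there is no 5).
ALLOWED_MARKS = {0, 1, 2, 3, 4, 6, 7, 8, 9, 10}


def total_marks(items):
    """Weighted total for a list of (mark, weighting) pairs."""
    total = 0
    for mark, weight in items:
        if mark not in ALLOWED_MARKS:
            raise ValueError(f"mark {mark} is not on the printed scale")
        total += mark * weight
    return total


def maximum_marks(items):
    """Highest total obtainable for the same items (every mark 10)."""
    return 10 * sum(weight for _, weight in items)


# Invented marks for the five "Second Landing" items, using the
# weightings shown against that section of the form: 2, 2, 3, 6, 2.
second_landing = [(7, 2), (8, 2), (6, 3), (9, 6), (4, 2)]

print(total_marks(second_landing), "out of", maximum_marks(second_landing))
```

Applied to all fifty items, whose weightings add to 100, the same calculation yields a raw total between 0 and 1,000.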

It had been anticipated that, despite all efforts to standardise
flight test procedure, the actual marks assigned at different schools
would show differences which were really due to divergent marking
habits. Fig. 14 demonstrates that this in fact proved to be the case;
had the schools assigned the same marks to the same sort of
performance in precisely the same way, the lines joining the quartile
points would (sampling errors in the basic quality of the groups
apart) have been horizontal. It was also demonstrated that when
the marks assigned at any one school were studied over a
considerable period of time, marking standards again showed considerable
change. These two sources of instability were combated by the
introduction of school conversion tables. Each table was based on
the performance of the last hundred cases to come through a given
school and involved the use of a rectilinear 20-point scale. Thus a
score of 20 would correspond to a performance within the top
5 per cent. of the population whatever the raw score allotted. This
solution was based on the assumption (demonstrated both by a
priori reasoning and empirically) that the risk of a hundred
randomly chosen cadets differing sharply in quality from that of
the general population was small enough to be disregarded.

Fig. 14. Second Flight Test Marks, April, 1942. First 100 Cadets
Graded at Each School.
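
A school conversion table of this kind can be sketched as follows. The raw marks of the last hundred cadets at a school serve as the reference group, and a new raw mark is converted to a 20-point scale in which each point covers one-twentieth of that group. The two reference groups are invented, and the treatment of ties and boundaries is an assumption; the original tables may have handled them differently.

```python
def convert_to_20_point(raw, reference):
    """Convert a raw flight-test mark to a 20-point scale.

    reference -- raw marks of the last hundred cadets at the same school.
    Each point of the scale covers 5 per cent. of the reference group,
    so 20 corresponds to the top 5 per cent.
    """
    below = sum(1 for mark in reference if mark < raw)
    return min(20, below * 20 // len(reference) + 1)


# Two invented schools whose marking habits differ by 200 raw marks.
lenient_school = list(range(601, 701))   # marks 601 to 700
severe_school = list(range(401, 501))    # marks 401 to 500

# The same standing within each school gives the same converted score.
print(convert_to_20_point(690, lenient_school))
print(convert_to_20_point(490, severe_school))
```

Because only a cadet's standing among recent cases at his own school enters the converted figure, a uniform difference in marking severity between schools drops out.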

Originally equal weight was given to the two flight tests, i.e. a
cadet’s grading score was reached by adding the two converted
figures to yield a total somewhere on a scale ranging from 40 to 2.
Enquiry soon showed, however, that the later test was more
predictive than the first. As soon as it was administratively practicable
a third test was introduced with a view to strengthening reliability,
and at this point the scoring was modified in two ways: decile
conversion tables were brought in in place of half-deciles, and in view
of its higher predictive power the last test received a double
weighting.
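
The two scoring rules can be set side by side in a short sketch. The original rule follows the text directly. For the revised rule it is assumed that each decile conversion runs from 1 to 10 and that the three figures are simply added with the last counted twice, which would again give a top score of 40; the text does not state the revised range.

```python
def grading_score_original(test1, test2):
    """Two tests, each converted to the 20-point scale, equally weighted."""
    return test1 + test2


def grading_score_revised(test1, test2, test3):
    """Three tests on decile (1-10) scales; the last test counts double.

    The simple additive form is an assumption made for illustration.
    """
    return test1 + test2 + 2 * test3


print(grading_score_original(20, 20), grading_score_original(1, 1))
print(grading_score_revised(6, 7, 9))
```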

The reliability of the individual flight tests was measured in a
couple of experiments during which secondary tests, given by
independent assessors, were introduced into the programme. The
first of these, following immediately after the seven-hour flight
test, yielded a test-retest correlation of .706 (360 cases). The second
additional test was given midway between the first and second
flight tests so that perfect test-retest conditions cannot be claimed.
The correlations (292 cases) were respectively .676 and 'Ses.
The relationships between the seven and eleven hour tests were also
calculated on these groups, the figures being .620 and .501. None
of these figures must be identified with the reliability of the full
grading procedure which, since it then contained two tests and
now has three, is almost certainly higher. Fig. 15 (based on J,000
cases) shows that cadets who received high grading assessments
subsequently showed a relatively low accident rate in pilot training,
while subsequent accident rates for candidates with lower grading
assessments increased sharply and steadily.

Fig. 15. Relation of Grading Performance to Subsequent Accident
Rate in Pilot Training (R.A.F. Graded Cadets Through S.F.T.S.).
Flight test performance at grading in U.K.
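
Why a procedure built from two or three tests should be more reliable than any single test can be illustrated with the Spearman-Brown formula, which is not cited in the text and is introduced here only as a rough guide. It assumes that the tests are parallel and equally weighted, which the grading tests were not (the last test carries double weight), so the figures below are indicative at best. The starting value of .62 is used simply as an example of a correlation between two single tests.

```python
def spearman_brown(single_test_reliability, number_of_tests):
    """Reliability of a composite of parallel tests (Spearman-Brown)."""
    r, k = single_test_reliability, number_of_tests
    return k * r / (1 + (k - 1) * r)


r_single = 0.62
for k in (1, 2, 3):
    print(k, round(spearman_brown(r_single, k), 3))
```

On these assumptions a single-test figure of .62 would rise to roughly .77 for two tests and .83 for three.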

Fig. 16 shows the relationship between age and flying failure
rates for both graded and ungraded populations. It might have
been expected that a selection method as successful as grading has
proved in other directions would have automatically levelled the
failure rate for those in the different age groups who survived it.
It will be seen, however, that those aged 29 and over who were
accepted for pilot training as a result of grading had nearly twice
as high a failure rate as the lowest age group, and it therefore
appears that grading does not sufficiently penalise cadets at the
older ages. As, however, the R.A.F. does not normally take men of
this age for pilot training, this unsolved problem is now of no
practical account.

Fig. 16. Relation of Grading to Age Differences and Failure Rate.

Table XLI. Air Crew Aptitude Test Battery (April, 1944)

                                        Short                              Category
    Test Description                    Name       Source                  P   Nav. A/B  W/O  F/E  A/G

 1. General Intelligence (G.V.K.)       Gen. A     Oxford Univ.            GENERAL
 2. Educational Attainments             Gen. B     Air Ministry            GENERAL
 3. Language and Judgment               Gen. C     Air Ministry            GENERAL
 4. Aviation Information                Gen. D     U.S.A.A.F.              EXPERIMENTAL
 5. General Maths.                      Mat. A     U.S.A.A.F.              -    X    -    -    -    -
 6. Mathematical Reasoning              Mat. B     U.S.A.A.F.              -    X    X    -    -    -
 7. Approximations                      Mat. C     U.S.A.A.F.              -    X    -    -    -    -
 8. Speed in Calculation                Mat. D     U.S.A.A.F.              -    X    -    -    -    -
 9. Elementary Calculations             Mat. E     Air Ministry            -    -    -    -    -    -
10. Table Reading                       Mat. F     U.S.A.A.F.              -    X    X    -    X    -
11. Mechanical Comprehension
    (Pictures)                          Mec. A     U.S. Navy               X    -    -    X    X    X
12. Mechanical Comprehension
    (Diagrams)                          Mec. B     Air Ministry
                                                   from U.S.A.A.F.         X    -    -    -    X    X
13. Technical Information               Mec. C     R.A.F.                  -    -    -    X    X    -
14. Dial Reading                        Ins. A     U.S.A.A.F.              X    X    X    X    X    -
15. Instrument Comprehension            Ins. B     U.S.A.A.F.              X    -    -    -    -    X
16. Aerial Photographs                  Obs. A     U.S.A.A.F.              X    X    X    -    -    -
17. Map Reading                         Obs. B     U.S.A.A.F.              X    X    X    -    -    -
18. Aircraft Silhouettes                Obs. C     U.S.A.A.F.              X    -    -    -    -    X
19. Pilot Co-ordinator (S.M.A.3)        Co-ord. A  Central Medical
                                                   Establishment (R.A.F.)
                                                   and Cambridge Univ.     X    -    -    -    -    X
20. Control of Velocity (C.V.T.)        Co-ord. B  Cambridge Univ.         EXPERIMENTAL
21. Finger Dexterity                    Co-ord. C  U.S.A.A.F.              -    -    X    X    X    -
22. Turret Manipulation                 Co-ord. D  Air Ministry            EXPERIMENTAL
23. Morse Record                        Morse A    U.S. Navy (modified)    -    -    -    X    -    -


Table XLI shows the content of the two-day testing programme
introduced in April, 1944 to yield aptitude measures for each of
the air crew categories. The columns on the right show the tests
originally used to elicit these category indices. The source column
makes clear the extent of our indebtedness to the United States
Army Air Force. Two of the experimental tests (Gen. D and
Co-ord. B) have since been validated and introduced into the pilot
battery. For details of length, timing and reliability the reader is
referred to Appendix II.

Table XLII. Validation of Category Indices

                                                      Significant         Correlation
                                                      (.01 level)
Category      Criterion                       No.     at              Uncorrected  Corrected

PILOT         Grading                         923     .10             .202         .474
              Initial (Ground) Training       311     .164            .285         .473
              No Basic Training Results to hand.

NAVIGATOR     Initial Training                600     .12             .475         .636
              Basic Training                  204     .18             .511         .687

AIR BOMBER    Initial Training                229     .20             .160         .220
              No tested subject came through the basic course of this
              now obsolete trade.

WIRELESS
OPERATOR      No meaningful training criterion.

FLIGHT        Initial Training                702     .11             .487         .602
ENGINEER      Technical Training (Phase 1)    437     .14             .107         .280
              Technical Training (Phase 2)    398     .15             .224         .326

AIR GUNNER    Initial Training                473     .14             .366         .414
              No Basic Training Results to hand.



To reach the different category indices the raw scores for each
test were converted on to a 9-point (Stanine) scale. Twenty
weighting units were differentially distributed among the tests in each
sub-battery so that all the resultant indices were expressed on a
common scale with theoretical range from 180 (9 × 20) to 20 (1 × 20).
As a number of the tests had not formerly been tried out on R.A.F.
personnel the initial weightings had to be based on evidence from
the countries of origin. An all-through validation plan was drawn
up as soon as the battery was brought into action and the
information yielded by the very intermittent post-war training
programmes is condensed in Table XLII. The corrected correlations
were reached by applying univariate correction; the corrections are
naturally largest where the populations selected for a category are
homogeneous (as with pilots and navigators) and smallest where
the selected population approximates to the basic population (as
with flight engineers). It will be seen that all the corrected
correlations are highly significant while all but one of the uncorrected
figures (that for air bombers) are significant at the .01 level. This
single exception is of no great concern as the navigator index was
always applied as a preliminary screen for the air bomber trade
and the uncorrected figure yielded by this index for the same
population exceeds the critical point.
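
The construction of a category index, and the kind of correction applied to the correlations, can be sketched as follows. Several points are assumptions made for the illustration: the stanine bands are the conventional ones (4, 7, 12, 17, 20, 17, 12, 7 and 4 per cent.), the weights in the example are invented, and "univariate correction" is taken to mean the usual formula for direct selection on a single variable, with `sd_ratio` standing for the spread of the index among all candidates divided by its spread among those selected.

```python
from bisect import bisect_right
from math import sqrt

# Upper percentile limits of stanines 1 to 8 (conventional bands).
STANINE_CUTS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine(percentile):
    """Stanine (1-9) for a percentile standing between 0 and 100."""
    return bisect_right(STANINE_CUTS, percentile) + 1


def category_index(stanines, weights):
    """Weighted sum of stanines; the weights must total twenty units."""
    if len(stanines) != len(weights):
        raise ValueError("one weight is needed for each test")
    if sum(weights) != 20:
        raise ValueError("the weighting units must add to 20")
    return sum(s * w for s, w in zip(stanines, weights))


def correct_for_selection(r, sd_ratio):
    """Correlation corrected for direct selection on one variable."""
    return r * sd_ratio / sqrt(1 - r * r + r * r * sd_ratio * sd_ratio)


# Invented sub-battery of four tests with weights 8, 6, 4 and 2.
weights = [8, 6, 4, 2]
print(category_index([7, 5, 6, 4], weights))            # a typical candidate
print(category_index([9] * 4, weights), category_index([1] * 4, weights))
print(round(correct_for_selection(0.30, 2.0), 3))
```

The correction grows with `sd_ratio`, which agrees with the remark that it is largest where the selected population is most homogeneous.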

It will further be noted that the figures for the navigator category
are considerably greater than for any other. This is probably
because intelligence and educational attainment are highly relevant
for navigators whereas these general abilities (granted a certain
minimum) are of relatively little account elsewhere. Where this
latter situation occurs, low but significant correlations for tests of a
more specific character naturally acquire an importance they
cannot otherwise claim.

Lack of staff has hitherto made it impossible to carry out much
development work either on the tests individually or on the battery
as a whole. Thus the size of the battery still suggests something in
an experimental stage, and there is little doubt that it could be
reduced to a more compact form if more labour could be devoted to
research. The same limitation has prevented many more detailed
investigations, particularly in the field of item analyses.


Table XLIII. Pocket Classification

                  Nav.            W/O             F/E             A/G
Candidate     Index  Pref.    Index  Pref.    Index  Pref.    Index  Pref.

1. Green       166    B        122    D        137    C        149    A
2. Brown       130    A         99    B        117    C        146    D
3. Grey         90    A        106    B        101    C        109    D
4. White       109    B         79    A         98    C        102    D


Table XLIII is intended to give an idea of how testing results
were finally interpreted. A classification involving only four people
is postulated (weekly intakes in 1944 frequently exceeded 1,000).
The guiding principle has always been: “Preference granted if
aptitude and Service needs permit.” In this case Service needs may
be taken to be for one navigator, one wireless operator, one flight
engineer and one air gunner. Preferences are indicated by letters
(A first choice, B second, etc.) and indices by numbers. Green, in
spite of his very high navigator index, will become the gunner by
virtue of his preference for this category. Brown, with an A
preference and a high index, will become the navigator. Grey’s aptitude
is too low for his A preference, but just adequate for his B, so he
becomes the wireless operator. The berths for White’s A and B
preferences are already filled and his aptitude for C (flight
engineer) is just adequate (had it been much lower it would have been
necessary to recommend leaving this berth unfilled). A
complication arose in the classification of pilots, for whom grading
constitutes a secondary screen. This was met by giving an alternative
classification to which approximately the lower half on the grading
tests reverted.
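
The principle “Preference granted if aptitude and Service needs permit” can be sketched as a simple rule applied to the four candidates of Table XLIII. The indices are those of the table, two that are hard to read being taken here as 99 and 98. The minimum index required for each category is not given in the text; the values below are hypothetical and were chosen only so that 90 is too low for navigator while 106 and 98 are just adequate for wireless operator and flight engineer. Taking the candidates one at a time in list order is likewise an assumption.

```python
def classify(candidates, berths, minimum):
    """Give each candidate his highest preference that is still open
    and for which his index reaches the minimum."""
    remaining = dict(berths)
    result = {}
    for name, record in candidates:
        result[name] = None   # berth left unfilled if nothing suits
        # Try categories in order of preference letter: A, B, C, D.
        for category in sorted(record, key=lambda c: record[c][1]):
            index, _preference = record[category]
            if remaining.get(category, 0) > 0 and index >= minimum[category]:
                remaining[category] -= 1
                result[name] = category
                break
    return result


candidates = [
    ("Green", {"Nav": (166, "B"), "W/O": (122, "D"), "F/E": (137, "C"), "A/G": (149, "A")}),
    ("Brown", {"Nav": (130, "A"), "W/O": (99, "B"), "F/E": (117, "C"), "A/G": (146, "D")}),
    ("Grey", {"Nav": (90, "A"), "W/O": (106, "B"), "F/E": (101, "C"), "A/G": (109, "D")}),
    ("White", {"Nav": (109, "B"), "W/O": (79, "A"), "F/E": (98, "C"), "A/G": (102, "D")}),
]
berths = {"Nav": 1, "W/O": 1, "F/E": 1, "A/G": 1}
minimum = {"Nav": 120, "W/O": 100, "F/E": 95, "A/G": 100}   # hypothetical

allocation = classify(candidates, berths, minimum)
print(allocation)
```

With these assumed minima the rule reproduces the allocation described above: Green to air gunner, Brown to navigator, Grey to wireless operator and White to flight engineer. A full intake would have to be weighed as a whole rather than man by man.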

So far comparatively little work has been undertaken on the 
problem of personality assessment for air crew. Throughout the 
war little could be done to standardise interviewing procedure, but 
late in 1946 a first step was taken with the issue of a questionnaire 
to 362 highly experienced air crew representing all commands and 
categories. The purpose of this was to find (a) what degree of
agreement existed as to the relevance of different traits for air
crew, (b) whether there was any correspondence between
importance and assessability at interview. Table XLIV gives the pooled
result of the 362 orders.
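
One plausible reading of “pooling of orders” is that each respondent's order of importance is turned into positions (1 for the most important trait) and the positions are added over all respondents, so that a low total marks a trait widely held to be important. That reading is an assumption, and the three orders below are invented.

```python
def pool_orders(orders):
    """Add up the position each trait receives across all orders."""
    totals = {}
    for order in orders:
        for position, trait in enumerate(order, start=1):
            totals[trait] = totals.get(trait, 0) + position
    return totals


orders = [
    ["Calmness", "Dependability", "Initiative"],
    ["Dependability", "Calmness", "Initiative"],
    ["Calmness", "Initiative", "Dependability"],
]

pooled = pool_orders(orders)
ranking = sorted(pooled, key=pooled.get)
print(pooled, ranking)
```

On this reading the totals for k traits ranked by n respondents must add to n × k(k + 1)/2, which gives a simple check on any pooled table; dividing a total by n gives the trait's mean position.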

Table XLIV. Personality and Character
Traits Deemed Important in Air Crew

(Pooling of Orders in 362 Questionnaires)

 1. Calmness ......................... 124!)
 2. Dependability .................... 1502
 3. Determination .................... 1847
 4. Initiative ....................... 1920
 5. Keenness ......................... 22n2
 6. Confidence ....................... 2200
 7. Co-operation ..................... 2360
 8. Discipline (Self-discipline) ..... 2427
 9. Decisiveness ..................... 2408
10. Aggressiveness ................... 2023
11. Respect and Influence ............ 4082
12. Sense of Humour .................. 4377
13. Powers of Self-Expression ........ 4043
14. Acceptability .................... 4002
15. Appearance and Bearing ........... 4728
16. Breadth of Outlook ............... 6308


This Table shows first a sharp cleavage in believed importance
between traits 10 and 11, and secondly, the agreed unimportance of
the traits most easily assessed in interview (e.g. traits 13, 14 and 16).
The problem now takes the form: is it possible to build up a reliable
standard interview in which assessments of only the traits held
important are to be rated, or is it necessary to abandon the
cross-table interview and substitute a much more elaborate procedure,
probably on W.O.S.B. lines? It was decided to look for a solution
of the first type, and a rating scale comprising traits 1-7 and 9 was
drawn up. (Nos. 8 and 10 were omitted after prolonged discussion
and one fresh trait, “Attitude to others,” was introduced as a result
of suggestions by those answering the questionnaires.) Two
reliability experiments were then undertaken: (1) between the
assessments of interviewer and observer at a single interview;
(2) between interviewers at successive interviews.


Table XLV. Reliability of Trait Assessments

                        Independent          Independent assessments
                        assessments at       at successive interviews
                        one interview        (Interviewer A v.
Trait                                        Interviewer B)
                        (N = 64)             (N = 87)

Calmness                .40                  .33
Confidence              .72                  .38
Dependability           .68                  .28
Co-operation            .77                  .60
Attitude to others      .42                  .23
Decisiveness            .63                  .43
Determination           .44                  .42
Initiative              .58                  .50
Keenness                .47                  .29



The numbers in both experiments were unavoidably very small.
The results on the whole conform to expectation, the coefficients
in the first column being in each case higher than their
counterparts in the second. The outcome of these findings is a restrained
optimism about the new type of interview, on which much
development work remains to be done. Meantime the following
points should be noted:

(1) No interviewer undertakes the new approach without
prolonged briefing and practice.

(2) No one is automatically accepted or rejected on the new 
interview. 

(3) Apart from the introduction of a new rating scale based on a
systematic analysis, the new interview differs from the old
(i) in being conducted individually, (ii) in making a more
general approach (e.g. the assessment of qualities is based
on a general appraisal of character rather than the
candidate’s attitude to Service life).

(4) It is realised that the listing of character traits may appear
to savour of atomism. The importance of considering the
relationship between traits and of considering personality
as a pattern is, however, strongly underlined in the briefing
of interviewers.
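
The reliability coefficients of Table XLV are correlations between two independent sets of ratings of the same candidates. The text does not say how they were computed; as a minimal sketch, assuming ordinary product-moment (Pearson) coefficients, the calculation might run as follows in Python. The function name `pearson_r` and the ratings are invented for the example.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length rating lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists of at least 2 ratings")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 5-point ratings of one trait for eight candidates.
interviewer = [3, 4, 2, 5, 3, 1, 4, 2]
observer = [3, 5, 2, 4, 2, 2, 4, 3]
r = pearson_r(interviewer, observer)
print(round(r, 2))
```

A coefficient near 1 would mean that the two judges ordered the candidates almost identically; with groups as small as those reported, a single coefficient is subject to wide sampling error.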

Selection for Ground Trades

A battery of three intelligence tests (G.V.K.) and a simple
arithmetic test has been given to the full ground population since
1942. The combined G.V.K. scores, (G + V + K)/3, are used as a
broad indicator of quality. Table XLVI is one of the guides
originally given to interviewers to assist them in interpreting
scores. The trades (some of them now obsolete) are divided into
six groups on a “general calibre” basis. Against each group the
average G.V.K. levels for “assured success” and “failure” are
quoted, the columns on the right giving proportions of good
(L.A.C. or A.C.1), moderate (A.C.2) and poor (ceased training)
trade training results actually found for each level. The probability
of success above the success level is of the order 20-1 and below
the failure level only about 3-1. The scores are the averages of
three percentiles based on a large random 1942 entrant population.
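
The use of the combined score against a trade group's two levels can be sketched in Python. This is an illustration under assumptions, not the procedure laid down for interviewers: the function names are invented, the levels of 80 and 60 are used purely as an example, and the treatment of a score falling exactly on a level is a choice made here.

```python
def combined_gvk(g, v, k):
    """Average of the three percentile scores, as described in the text."""
    for score in (g, v, k):
        if not 0 <= score <= 100:
            raise ValueError("percentiles must lie between 0 and 100")
    return (g + v + k) / 3


def band(score, success_level, failure_level):
    """Place a combined score relative to a trade group's two levels.

    Counting a score exactly on a level as belonging to that level is an
    assumption of this sketch.
    """
    if score >= success_level:
        return "at or above 'assured success' level"
    if score <= failure_level:
        return "at or below 'failure' level"
    return "between the two levels"


# Illustrative percentiles and levels only.
score = combined_gvk(g=85, v=70, k=88)
print(score, band(score, success_level=80, failure_level=60))
```

A recruit falling between the two levels is the case in which the interviewer's other evidence would presumably carry most weight.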

When the interviewer has satisfied himself on a recruit’s broad
grouping, a study of his profile scores can often assist him in the more
delicate business of recommending to a specific trade in the group.
The most obvious application of profile interpretation will be the
separation of High V–Low K from Low V–High K cases, the
former sub-group being more likely to succeed in the clerical and 
the latter in the practical types of occupation. 
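
The separation of uneven profiles described above can be put as a simple rule. The following Python sketch is hypothetical throughout: the text gives no numerical criterion, so the `margin` of 20 percentile points, the function name `profile_hint` and the example scores are all assumptions.

```python
def profile_hint(v, k, margin=20):
    """Suggest a broad occupational direction from V and K percentiles.

    `margin` (an assumed value) is how far apart the two scores must be
    before the profile is treated as uneven.
    """
    if v - k >= margin:
        return "clerical"
    if k - v >= margin:
        return "practical"
    return "no clear lean"


print(profile_hint(v=80, k=45))  # High V, Low K
print(profile_hint(v=40, k=75))  # Low V, High K
print(profile_hint(v=60, k=55))  # even profile
```

Such a rule would serve only as a pointer within the broad group already settled by the combined score.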

Table XLVII gives G.V.K. multiple correlations (both before
and after correction for selection) in respect of thirteen R.A.F. and
eight W.A.A.F. trades. These calculations were based on training
results in 1944 and 1946, the populations for each trade numbering
between 100 and 300 (usually 200).
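
The text does not say which formula was used for the correction for selection, and a multiple correlation strictly requires a multivariate treatment. As a simplified sketch only, the standard univariate correction for direct selection on a single predictor is shown below in Python; the function name and the figures in the example are invented.

```python
import math


def correct_for_range_restriction(r_selected, sd_unselected, sd_selected):
    """Estimate the correlation in the unselected population from the
    correlation observed in a group selected on the predictor.

    Univariate formula for direct selection on the predictor:
    R = r*u / sqrt(1 - r**2 + (r*u)**2), where u is the ratio of the
    unselected to the selected standard deviation of the predictor.
    """
    if sd_selected <= 0 or sd_unselected <= 0:
        raise ValueError("standard deviations must be positive")
    if not -1 <= r_selected <= 1:
        raise ValueError("a correlation must lie between -1 and 1")
    u = sd_unselected / sd_selected
    ru = r_selected * u
    return ru / math.sqrt(1 - r_selected ** 2 + ru ** 2)


# Invented figures: observed r of .40 in a group whose spread of scores
# is two-thirds of that in the full entrant population.
corrected = correct_for_range_restriction(0.40, sd_unselected=15, sd_selected=10)
print(round(corrected, 2))
```

When the selected group is as variable as the population (u = 1) the formula leaves the coefficient unchanged; the narrower the selected group, the larger the upward correction.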

Table XLVI. Showing Average G.V.K. Scores for R.A.F. Ground Trades


Trade / Average G.V.K. level for: / Average G.V.K. Score

Meteorologist
Radio Wireless Mechanic
        ‘assured success’  80
R.D.F. Wireless Mechanic
        ‘failure’  60
Photographer II
Clerk Special Duties
Clerk General Duties
Radio Direction Finding Op.
Clerk Accounting
        ‘assured success’  70
Link Trainer Instructor
Wireless Op.
W./T. (Slip Reader) Op.
        ‘failure’  40
Flight Mechanic II
Armourer (Bomb or Gun)
Instrument Repairer II
Radio Telephone Op.
        ‘assured success’  60
Electrician II
Armoured Car Crew
Physical Training Instr.
Teleprinter Op.
        ‘failure’  30
Motor Boat Crew
Equipment Assistant
Motor Transport Mech.
Motor Transport Driver
        ‘assured success’  60
Ground Observer
Service Police
Medical Orderly
Telephone Op.
Torpedoman
        ‘failure’  20

(v)  Ground Gunner
     Fabric Worker
     Cook and Butcher
     Groundsman

(vi) Batman
     Balloon Op.
     Aircrafthand General Duties

Training results, showing % of airmen at each level:

     L.A.C. or A.C.1     A.C.2     C.T.
          46              65         0
          2.5             66.6      31

Failure score average G.V.K. of less than [illegible]


Table XLVII. Multiple Correlations Between G.V.K. Scores and
Training Results for a Variety of R.A.F. and W.A.A.F. Trades


                                   R.A.F. Multiple r        W.A.A.F. Multiple r
Trade                              before      after        before      after
                                   correction  correction   correction  correction

Radio Wireless Mechanic              .64         .76
Radar Operator                                                            .68
Radio Telephone Operator             .61                                  .66
Clerk Provisioning                   .40         .68
Equipment Assistant                  .40         .54
Wireless Operator                    .61          -           .43
Wireless Telephone Slip Reader       .14          -           .37         .66
Telephone Operator                   .37         .50          .36         .48
Flight Mechanic (Airframes)          .30         .47
Flight Mechanic (Engines)            .37         .46
Photographer                         .29         .44          .46         .60
Carpenter II                         .33         .40          .28         .43
Electrician II                       .24         .36          .34         .62
Aircraft Finisher                    .26         .32


This Table makes clear that for certain high-grade trades G.V.K. 
alone can offer excellent predictive value. It was, however, never 
intended that it should do more than supply pointers to the trade 
allocation problem for which a series of more specific tests is also 
required. Some such tests have been used to supply corroborative 
evidence for special trades, but the opportunity to develop a ground 
trades battery analogous to the air crew aptitude battery has not yet 
arisen. When this does occur there will be several major problems 
to be tackled simultaneously with the laying-on of tests, viz.: 

(1) Evaluation of the strength and meaning of trade preferences
vis-à-vis demonstrated aptitude.

(2) Intensive briefing of interviewers as to the reliance that may 
be placed on the results of a more elaborate testing procedure. 

(3) Separate consideration of the volunteer and National Service 
entrants. 

The complexity of the ground trade situation (many trades,
changing quotas, civilian trade experience, etc.) makes the overall
validation of a selection programme inordinately difficult; but
against this it may be argued that the complexity is itself a reason
necessitating scientific selection. With such a wide range of human
material to place and such a variety of allocations to be made the
possibility of misplacement is maximal and, without the most
careful planning, unavoidable.
















CHAPTER XVII

CONCLUSIONS 

The following series of brief statements summarises the
conclusions based on personnel selection work in the Forces which we
regard as having vocational or educational applications in
peace-time. They are put forward somewhat dogmatically, but are
accompanied by references to pages in the text where the relevant
evidence or discussion has been presented.

Vocational Psychology 

1.1. Psychology has many applications to contemporary human 
problems. It has proved itself particularly useful in the prediction 
of children’s and adults’ educational and vocational capabilities 
(11-12).

1.2. There is little to be learned from German vocational
psychology of the 1930s, except from its mistakes. But the
applications and the technical development of psychological selection
methods in America are generally more extensive than in Britain.
At the same time there are several aspects of American vocational
psychology which should be viewed with reserve; for example, the
extreme emphasis on tests, the almost universal use of
selective-response test items, the analysis of abilities into independent
factors, the mass production of clinical psychologists, etc. (13-24,
9^96, 109-170).

1.3. The greater advances of applied psychology in America 
than Britain are due partly to the greater resources of men and 
materials, but also to the greater prestige in which psychology is 
held by Government and Service authorities and by the public. 
Its popularity here has now much increased (probably as a result 
of its war-time achievements), and there are far too few qualified 
psychologists to meet the demands from industry, education, the 
Services, the universities, etc. (18, 23-24, 100-102).

Administrative and Other Considerations 

2.1. The introduction of psychological methods does not involve 
arbitrary or bureaucratic direction of human beings. On the 
contrary psychologists stress the individuality of character, abilities
and interests, and try to arrive at recommendations which will best 
suit both their subjects or candidates and the prospective employers 
or teachers (11-12, 86-89). 

2.2. Psychologists do not wish for executive authority in
carrying out selection or other schemes. They prefer to act as technical
advisers, who provide the methods and train the personnel officials
or teachers to apply them, but who leave the final decisions to the
users. They regard it as unpsychological to impose a ready-made
scheme on an industrial firm, an educational authority or an
institution like the Army. They are willing to educate the users
gradually, and to prove the worth of their scientific methods by
the results they obtain, as they go along. At the same time they can
hardly be expected to do their best work if their advice is
constantly over-ruled, their methods misapplied, and if they are not
allowed to administer their own affairs (27-29, 41-42, 64-66,
92-94, 100-102).

2.3. Vocational and educational psychologists have shown
themselves capable of assessing social and emotional factors in their
child or adult subjects, in addition to abilities, without having to
refer any but seriously maladjusted cases to persons with
psychiatric training. Nevertheless, so many selection, and other industrial
and educational, problems involve the deeper, unconscious, factors
in human motivation that co-operation between the psychological
and psychiatric approaches is desirable. Another important
consideration in deciding the rôle of these two professions is their
relative acceptability to the users (20-21, 33, 68-60, 93-94).

Broader Psychological Aspects of Vocational Classification

3.1. Vocational or educational classification schemes should
consider the needs of the institution (industrial firm, Army, etc.), and
of its members as a whole. Vocational selection which merely
attempts to pick the best men for one job, regardless of the needs
of other branches of the institution, or of wrongfully rejected
candidates, and guidance which merely recommends the most suitable
jobs for single individuals, are patchwork measures which may
actually increase vocational maladjustment in the long run (86-89, 92-94).

3.2. The value of vocational and educational classification
schemes lies not merely in the closer co-ordination of human
capacities with job or school requirements, but also in their effects
on morale. Bad selection leads to lack of confidence between
employers and employees, soldiers and officers, parents and schools,
and possibly to neurosis. Selection that appears good has the
opposite effects even if, judged by scientific standards, it is far from
efficient. To appear good it must include consideration of each
individual’s interests and abilities by a sympathetic interviewer,
and the tests or other instruments employed must be obviously
relevant to the job (26-27, 40, 46, 52, 66-66, 97, 265).

3.3. The attitudes of the candidates towards a scheme are
important, also, because their performance in psychological tests
and their frankness in questionnaires or interviews are affected. It
is unsafe to assume that tests which worked well in, say, the Army,
would be equally effective, or that the results obtained would be
duplicated, in an industrial or educational context where the
motives and opinions of the candidates differ. While this applies
particularly to personality factors, assessments of abilities also are
not likely to be entirely immune. The confidential nature of
information obtained from tests, questionnaires or interviews should be
respected (66, 92-94, 100-102, 218, 267, 260).

Connection Between Classification and Training 

4.1. Though classification may often have to be based on a few
hours’ testing and interviewing of the subjects, i.e. on a
cross-section of their traits and abilities, it should preferably be
integrated with training as a longitudinal process. This has the
additional advantage that the training given can be adapted to the
quality of the available trainees (92, 97-98).

4.2. Both classification and training are closely bound up with 
job simplification. The industrial psychologist, or time and motion 
study expert, can often so reduce the complexity of a job that a 
much larger proportion of candidates can manage it with simpler
training (98). 

4.3. Re-classification of subjects who have failed in a first job
demands particularly skilful handling because of the blow to their
morale. The common assumption that there is a right job for each
individual, and that if he fails in one thing he possesses abilities
which will make him successful at another equally complex job of
a different type, may be true in some cases. More often it is
necessary to recommend a lower-grade employment whose demands on
the individual will be smaller (32-33, 48-49, 149).

Further Characteristics of Classification Schemes

5.1. A well-devised classification scheme does not depend on
the supply of candidates falling short of, or exceeding, the number
of jobs. It should also not usually involve great expenditure either
of the time of highly-qualified personnel, or in the provision of
elaborate tests and very detailed job analyses (85-89).

5.2. At the same time effective classification schemes cannot be
run by amateur psychologists. The planning, administration and 
documentation, the proper choice of testing or other methods, the 
interpretation of their results, and in particular investigations into 
the worth of the methods, are inevitably highly technical matters 
(Chapters VI, VII, X). 

Proving the Value of Classification Methods 

6.1. No selection method (e.g. a test or examination), nor datum
used in selection (e.g. an item from previous history, or an
impression made in an interview) should be regarded as predictive of
vocational or educational suitability without experimental proof.
That a method works well in the “experience” of the user and
seems to him to select suitable candidates shows that his attitudes
to the method are favourable, but does not prove scientifically that
it is a valid instrument (99-100, 119-120, 124).

6.2. It is not worth the psychologist’s while to embark on a
classification scheme unless he can be assured of the availability of
a trustworthy and meaningful criterion of success among the
people he is to classify, by comparison with which the value of his
methods can be gauged. Gradings of proficiency by a single
supervisor or teacher are not trustworthy. Written examinations at the
end of a training course do not constitute a meaningful criterion of 
competence at some practical job. Nevertheless, unsatisfactory 
criteria can be much improved by the technical methods developed 
in the Forces (106-112). 

6.3. It is seldom a straightforward or simple matter to
demonstrate the value of a classification procedure as a whole, or of its
component instruments. The main difficulties include
untrustworthiness of criteria, lack of sufficient cases, disturbing influences
such as variations in standards of marking or grading, and
“selectivity”, i.e. the fact that it is only the candidates actually selected
whose proficiency at the work can be assessed. Owing to this last
factor, it is more difficult to prove the value of a procedure the
more efficient that procedure is; also, any method or datum
actually used in selecting is liable to appear less valid than methods
not already used (107-108, 111-112, 113-117, 122-123).
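
The effect of “selectivity” can be illustrated by a small simulation. The Python sketch below is not drawn from the Service data: it invents a test and a criterion with a population correlation of .6, “selects” the top 30 per cent on the test, and compares the correlation among all candidates with that among the selected alone. All names and figures are assumptions made for the illustration.

```python
import math
import random


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_selectivity(n=5000, true_r=0.6, top_fraction=0.3, seed=1):
    """Return (r among all candidates, r among those selected on the test)."""
    rng = random.Random(seed)
    test, criterion = [], []
    for _ in range(n):
        t = rng.gauss(0, 1)
        # Built so that the population test-criterion correlation is true_r.
        c = true_r * t + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
        test.append(t)
        criterion.append(c)
    # Keep only candidates at or above the cut-off score on the test.
    cutoff = sorted(test)[int(n * (1 - top_fraction))]
    kept = [(t, c) for t, c in zip(test, criterion) if t >= cutoff]
    r_all = pearson_r(test, criterion)
    r_selected = pearson_r([t for t, _ in kept], [c for _, c in kept])
    return r_all, r_selected


r_all, r_selected = simulate_selectivity()
print(f"all candidates: {r_all:.2f}; selected only: {r_selected:.2f}")
```

The stricter the selection on the test, the lower the coefficient observable among those accepted, which is why a method already in use tends to look less valid than it is.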

6.4. Very big numbers of subjects are essential for most
investigations into the value of classification methods. Several parallel
studies of moderate-sized groups are preferable to a single study 
of one large group, because the results from different groups are 
liable to vary so widely (112-113). 

6.5. In spite of all these difficulties the value of classification
schemes has been proved beyond all doubt not only in connection
with such jobs as naval, Army and A.T.S. mechanics and clerks,
but also among R.A.F. aircrew. Less striking, but none the less
appreciable, success was demonstrated in officer selection. There
can be few jobs that call for more complex traits and abilities than
these, and few, therefore, where psychological methods are unlikely
to bring about improved selection (119-127, V, XVI).

The Staff Needed in Classification Schemes

7.1. The greater part of the day-to-day classification work can
be effectively carried out by non-psychologists such as teachers
and personnel officials. They require not merely careful selection
and training but constant supervision. So great are the variations
in the capacities of such persons that the value of their work should
be followed up individually. Women are at least as suitable in most
fields as men of equal intelligence and education, even when the
psychological work is largely concerned with adult men (27-29,
41, 44-46, 73-74, 100-102, 182-163).

7.2. The training of these persons is best done by
apprenticeship to a qualified psychologist. Lectures and reading can play
some part, but practice in the actual application of psychological
methods, and practical tests of skill before they work on their own,
are essential. Work that largely involves interviewing requires if
anything more training than work consisting chiefly of test
administration (27-29, 41, 44-46, 73-74, 100-102).

7.3. Such persons constitute an invaluable element in a selection,
or other psychological, scheme because any human affair requires
tactful handling by a human being. But so great is the fallibility
of human intuition and “commonsense” that they should be
provided with as accurate tools as possible, and encouraged to use
them in preference to their hunches. Only the exceptional human
being has acumen superior to impersonal, scientifically validated,
procedures (99-100, 162-104).

Provision of Information About Jobs 

8.1. If candidates are supplied with accurate and easily-grasped
information about available jobs, many of them are capable of a
large measure of self-guidance. This not only saves the
interviewer’s time but provides him or her with an indication of the
keenness of their interests. Interviewers also must be given
information in a standard, usable form (31, 40, 90-91).

8.2. The job descriptions needed for this purpose, or as a basis
for test construction, should be prepared by experts and industrial
psychologists in collaboration, and should cover all the relevant
physical and social conditions, not merely the operations
performed. As little use as possible should be made of terms referring
to generalised traits or abilities (concentration, dexterity, and the
like) (47, 49, 90-91, 92-93, 167, 173).

Collection of Biographical Data 

9.1. The central feature of classification procedure is the
biographical questionnaire or cumulative record form where all the
relevant information about previous history, test scores, interview
judgments and recommendations is brought together.
Precautions in drawing up such a form are summarised on pp. 134-136
(29, 43, 91-92).

9.2. Such items as previous occupational experience, age,
education, evidence of leadership and of a responsible attitude to work,
and interests in particular fields, often have greater value in
predicting occupational success than aptitude tests. As, however, it is
difficult to elicit reliable information about these matters by
written questionnaire alone, they must either be checked up in
interview or, better, measured objectively by suitable tests (96-96,
136-142).

Interviewing

10.1. Conclusions relating to the conduct of the employment or 
diagnostic interview are summarised on pp. 143-146 (cf. also 
31-32, 96-97, 146-162). 

10.2. Different interviewers of the same candidates are found to
arrive at widely differing conclusions, unless they are exceptionally
thoroughly trained and have a clear and agreed conception as to
what they are looking for. The interview judgments of
psychologists and psychiatrists may have considerable value, though even
these are very variable. But the average interviewer’s summing up
of past history and his judgments of personality qualities tend to
be so subjective that they detract from, rather than add to, the
accuracy of predictions based on properly weighted combinations
of test scores and other objective data (162-164).

10.3. At the same time this “psychometric” approach to the
assessment of vocational suitability tends to be too rigid and 
mechanical, and may require immense technical resources. In 
situations where the numbers are small, or where job requirements 
alter rapidly, it is quite impracticable. Its impersonality also makes 
it much less acceptable. The interview is, therefore, essential on the 
grounds of flexibility and humanity, in spite of its inaccuracy 
(94-97, 163-164). 

Psychological Tests and Test Standards 

11.1. What a psychological test appears to measure (its face
value) is important from the standpoint of acceptability, but
throws little light on what it actually does measure. Tests should
not be regarded as eliciting hypothetical traits or abilities, but
should be directly related to job proficiencies by follow-up research
and their content investigated by analysis of the main common
factors running through them and other tests (164-168).

11.2. Absolute standards or minimum (critical) test scores
should not be laid down for acceptance for a job, or a certain type
of schooling, since the state of supply and demand must be taken
into account. The psychologist does not say that this man could
never be successful at such and such work, but that there is a high
or a low probability of his success. Follow-up information should
show the optimum ranges of scores on the most relevant tests or
groups of tests, and of other data such as age, for each job under
consideration (86-89, 177-180, 278-279).
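
The statement of a probability of success for each range of scores can be sketched as a simple expectancy table built from follow-up records. The Python example below is illustrative only; the function name `expectancy_table`, the score bands and the follow-up records are invented.

```python
def expectancy_table(records, bands):
    """Proportion of successful trainees within each score band.

    `records` is a list of (score, succeeded) pairs from follow-up;
    `bands` is a list of (low, high) ranges, inclusive of `low` and
    exclusive of `high`.
    """
    table = {}
    for low, high in bands:
        outcomes = [ok for score, ok in records if low <= score < high]
        # None marks a band with no follow-up cases, not a zero rate.
        table[(low, high)] = sum(outcomes) / len(outcomes) if outcomes else None
    return table


# Invented follow-up data: (test score, passed training?)
follow_up = [(25, False), (30, False), (35, True), (45, False),
             (50, True), (55, True), (70, True), (75, True), (80, True)]
rates = expectancy_table(follow_up, bands=[(20, 40), (40, 60), (60, 100)])
for score_band, rate in rates.items():
    print(score_band, rate)
```

With such a table the acceptance level can be moved up or down as supply and demand require, the cost in expected failures being visible at each step.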

Types of Tests Most Suitable for Classification

12.1. A standard battery of group paper-and-pencil tests,
containing verbal, diagrammatic or pictorial material, will by itself
cover a good deal of the ground in vocational classification schemes.
Tests of the information and trade knowledge type should also be
available, and for certain jobs supplementary analogous or
“miniature situation” tests are desirable. But our evidence
indicates that it is not usually necessary to devise an elaborate battery
of practical tests for measuring the aptitudes presumed to
underlie each job, except in applying selection schemes to rather
homogeneous samples of candidates (108-172, XII).

12.2. In constructing a new test, both the test as a whole and its 
component items should be shown to contribute to the prediction 
of vocational suitability over and above the standard battery. 
Economic considerations such as availability of materials and time, 
personnel for constructing, applying and scoring, etc., must be 
weighed against this contribution (172-176, 243). 

12.3. Tests must be adapted to the population concerned. For
example, creative-response items are preferable to
selective-response in this country, and time limits should usually be
generous. Tests of the abstraction and clerical type, and oral directions
(provided delivery can be standardised) appear to work better
among average and sub-average adults than the more conventional
analogies, classification, reasoning problems, etc. Tests should not
be too numerous, and most of them can and should be quite short,
not more than fifteen minutes (170, 176, 220-224).

The Value of Certain Tests of Abilities 

13.1. Objective spelling tests tend to be more reliable and are as 
valid as dictation tests of the same length, among adults. Short 
answer arithmetic and mathematics tests have wider predictive 
value than lengthy arithmetical reasoning examinations (226-229).

13.2. Such mathematics tests and verbal tests of the clerical type 
measure intelligence and educability in adults to a greater extent 
than actual education. Hence they were found to be the most useful 
tests of any in most Army and Navy jobs where rapid trainability 
and adaptability were needed (213-219, 226-229). 

13.3. Mechanical comprehension, models and assembly tests,
tests of spatial judgment and performance tests, tend to measure
general practical intelligence in adult males rather than suitability
for specifically mechanical occupations (information tests are the
most predictive for this purpose). Such tests are more successful
among adolescents and women, whose previous mechanical
experience is less extensive and varied (236-246).

13.4. A combination of such mechanical and spatial tests with
verbal and educational tests provides a better measure of all-round
adult potentiality than do tests which aim to measure pure
intelligence by non-verbal material (206, 234-236).

Factors Influencing Test Performance

14.1. Test performance is much affected by the recency of 
schooling and by the amount of intellectual exercise people get in 
their work. Thus practical abilities tend to rise during adolescence,
but educational ones sink markedly except among those who 
receive further education, or who are engaged in intellectual jobs. 
General intelligence also, as measured by psychological tests, 
starts to decline even before the age of 20 among those who make 
little “use” of their “brains” (188-196). 

14.2. Neither menstruation nor colds were found to have any
consistent effects on the test performances of women, though
evidence was obtained of improvement on certain intelligence tests
among men of poor physique as a result of physical training courses
(201-203).

14.3. An average rise in psychological test scores of about 5 per
cent. (6 I.Q. points) is likely to occur among unsophisticated
testees from having taken the same test before. Previous experience
of other tests and general schooling or the taking of examinations
appear to produce smaller, but appreciable, increases. Further
practice produces progressively smaller effects. Hence by giving
fore-exercises or one or two preliminary tests, the differences
between testees with different amounts of previous experience can
be much reduced. So far as the evidence goes, practice effects do
not diminish with lapse of time. They are smaller in
straightforward tests with ample time limits than in selective-response
tests with complicated instructions and unusual content. It follows
that the dangers of leakages of tests, or of previous coaching on
them, are very considerable (182-187).

Assessment of Personality Qualities 

15.1. War Office Selection Boards have shown the superiority of
thorough study of candidates by several trained judges to ordinary
interview methods of selection, and this aspect of W.O.S.B.
procedure is worth applying to the selection of managers,
administrative Civil Servants, and in other high-grade occupations. The
inclusion of exercises, analogous to the job for which selection is
taking place, helps to create confidence in the scheme, but does not
necessarily improve its scientific worth. Observations of groups of
candidates performing these exercises, and judgments of
personality based on them, may be fully as subjective and unreliable as
interview diagnoses. Far more investigation is needed into the best
way of standardising the methods and making them more objective
(22-23, 62-66, 122-127).

15.2. Though a definite beginning has been made in the objective
measurement of important personality factors such as stability v.
neuroticism, and extraversion v. introversion, most of the tests are
too elaborate and time-consuming for large-scale application.
Useful results are nevertheless obtainable with simple personality
and interest questionnaire tests, provided that they are carefully
devised in the light of the attitudes of the testees (261-264,
266-260).

15.3. Personality tests of the projection type, and indirect tests
based on observation of reactions to stress, may give useful
indications to trained psychologists, but are too subjective for general
use. Objectively scored projection tests deserve further exploration.
Among the most valuable measures of suitability for promotion to
posts of responsibility and leadership are ratings or “nominations”
by a man’s fellow-workers (264-267).
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APPENDIX I 


ABBREVIATIONS EMPLOYED IN THE TEXT OR IN 

APPENDIX II 

A.A. Anti-aircraft (Army) 

A.C.R.C. Air Crew Reception Centre ‘ 

A.C.S.B. Aviation Candidates Selection Bowd ' 

A.S.C. Army Selection Centre. 

A.T.S. Auxiliary Territorial Service (Army), now Women’s 
Royal Army Corps 

Clerks P.S. Personnel Selection Clerks (R.A.F.) 

C.O. Commanding Officer 

C.P.O. Chief Petty Officer 

C.R.C. Combined Recruiting Centre 

C.S.C. Combined Selection Centre (R.A.F.) 

C. W. Commission and Warrant (naval officer) 

D. S.P. Directorate for Selection of Personnel (Army) 

E. E. Entry Establishment (naval recruits) 

E.F.T.S. Elementary Flying Training School 

E.M,S. Emergency Medical Service 

E.S.C. Extended Service Commission (R.A.F.) 

g General intellectual factor in ability 

G.S. General Service (Army) 

G. V.K. R.A.F. Testa of g, v and k 

H. O. Hostilities Only (recruits to the Navy under the 

Wartime Conscription Act) 

I. H.R.B. Industrial Health Research Board 

I.T.W. Initial Training Wmg (R.A.F, recruits) 

k Visuo-spatial factor in ability 

k ; m Spatial-mechanical factor 

M. T.O. Military Testing Officer (War Office Selection 

Boards) 

N. I.I.P. National Institute of Industrial Psychology 

N. S.A. National Service Act (recruits under the post-war 

conscription Act) 

O. C.T.U. Officer Cadet Training Unit 
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O.I.R. Officer Intelligence Rating 

O. T.U. Operational Training Unit (R.A.F.) 

P. S.O. Personnel Selection Officer 

P.T.C. Primaiy Training Centre (army recruits) 
r Correlation coefficient 

R.A.C. Royal Armoured Corps 

R.A.F.S.B. Royal Air Force Selection Board (officers) 

RA.M.C. Royal Army Medical Corps 

RA.S.C. Royal Army Service Corps 

R.C.A.F. Royal Canadian Air Force 

R.E. Royal Engineers

R.E.M.E. Royal Electrical and Mechanical Engineers

R.N.V.R. Royal Naval Volunteer Reserve

R.T.C. Research and Training Centre (War Office Selection Boards)

S.F.T.S. Service Flying Training School

S.G. Selection Group or Grade (psychological test standard)

S.P. Senior Psychologist to the Admiralty

S.P. Test Senior Psychologist, or Selection of Personnel, Test

T2 Total score on standard battery of naval tests 

T.R. Training Recommendation (army recruits)

U.S.A.A.F. United States Army Air Force

v Verbal factor in ability

v:ed Verbal and educational factor
W.A.A.F. Women’s Auxiliary Air Force, now Women’s Royal 
Air Force 

W.O.S.B. War Office Selection Board (officers) 

W.R.N.S. Women’s Royal Naval Service 
THE MAIN PSYCHOLOGICAL TESTS USED IN THE FORCES

Number or Abbreviated Title. Different series of numbers were 
used by Admiralty and War Office psychologists for most of their 
tests, but as these seldom conflicted the naval and army tests are 
listed together below. R.A.F. tests, to which short titles were 
given, appear at the end. 

Use. N=used in the Royal Navy, A=the Army, a=the A.T.S.,
F=the R.A.F.

Time. This column gives the working time, in minutes, apart 
from instructions. 

Reliability. These coefficients are mostly based on retesting after
1 to 6 months, and refer to representative samples of the population
in the case of naval and army tests. R.A.F. samples were
usually more highly selected. Italicised coefficients were obtained
by the Kuder-Richardson or the split-half technique.
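
The two internal-consistency techniques named above can be illustrated with a short sketch. The text does not give the formulae or say which Kuder-Richardson formula was applied, so the following is an assumption throughout: it takes items scored 1 (right) or 0 (wrong), uses an odd-even split stepped up by the Spearman-Brown correction, and uses Kuder-Richardson formula 20. The function names are hypothetical and chosen for illustration only.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def split_half_reliability(scores):
    """Odd-even split-half coefficient with the Spearman-Brown step-up.

    `scores` holds one list of 0/1 item scores per candidate (assumed layout).
    """
    odd = [sum(person[0::2]) for person in scores]
    even = [sum(person[1::2]) for person in scores]
    r_half = pearson_r(odd, even)
    # Spearman-Brown: estimate full-length reliability from the half-test r.
    return 2 * r_half / (1 + r_half)


def kr20(scores):
    """Kuder-Richardson formula 20 for dichotomous items (assumed variant)."""
    k = len(scores[0])  # number of items
    n = len(scores)     # number of candidates
    totals = [sum(person) for person in scores]
    var_total = pvariance(totals)
    # Sum of p*q over items, where p is the proportion passing each item.
    sum_pq = 0.0
    for i in range(k):
        p = sum(person[i] for person in scores) / n
        sum_pq += p * (1 - p)
    return k / (k - 1) * (1 - sum_pq / var_total)
```

Both functions need only one sitting of the test, which is the practical difference from the retest coefficients described in the same paragraph.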
Title    Author or Source    Use    No. of Items    Time (mins.)    Reliability
64, 126, 180
Pressey, S. L., 268
Price, E. J. J., 238, 308
Prisoners of war, 49, 63-4, 73 n. 
Project method, 90 
Psychiatrists, in Army selection, 40, 
45, 47, 49, 161, 161, 226, 239, 
251, 266-7 
in child guidance, 20 
in naval selection, 31, 33, 160, 166, 
167 

in personnel selection, 153, 282, 
287 

in R.A.F. selection, 161, 166, 161-2
in U.S. Coastguard selection, 163
in W.O.S.B.s, 63, 66-80, 62, 64, 
126, 169-61 

suspicions of, 33, 64, 68, 282 
Psychoanalysis, 16, 33, 68-0, 73, 93-4,
282 

Psychologists at W.O.S.B.s, 64-8, 
100-11, 120, 169-00 
Psychology, suspicions of, 11, 38, 42, 
64r-e, 95, 267, 281-2 


Public opinion surveys, 24, 38, 144, 
162, 164 


Qualification form, 43, 46, 131-3, 138-
9, 160

Questionnaires, 96-7, 131-42, 144, 149, 102-3,
160, 171, 173, 210, 283, 286; see
also Qualification form
follow-up, 109-10 
naval, 20, 31, 35, 131, 136, 138, 
140-2, 160, 160, 210 
personality, see Personality inven¬ 
tories 


R.A.F., 77 

scoring of, 06, 100, 112, 136-42, 
163-6, 162-4, 174, 284, 286 
W.O.S.B., 66, 67, 134 
Quetelet, L. A. J., 16


R.A.F. College, Cranwell, 80
R.A.F. Selection Boards, 80, 290
R.A.F. Selection Centres, 78-0
Rapaport, D., 20, 808 
Ratings for proficiency, 48, 107, 109- 
11, 213, 216-7, 267-9, 284; see 
also Merit ratings; Personality 
Raven, J. C., 226, 234, 286, 308-9 
Reaction time, 14; see also Tests 
Recruiting Centres, selection at, 26-7,
20-30, 40, 43, 47, 77-8, 137, 160,
165, 157, 184, 186, 107, 198,
242, 206, 208-0
Rees, J. R., 45, 300 
Rehabilitation, 40, 04, 06 
Reid, D. D., 102 

Reliability of examinations, 14-15, 
18, 80, 08, 11, 213 n. 
follow-up criteria, 106-7, 111-16, 
123, 126, 174, 306, 208, 217, 
248, 284 

interview judgments, 162-3, 166-0, 
162, 277, 200 

questionnaire items, 134-7
tests, 47, 68, 106, 100, 176, 184, 
201, 216, 226-0, 231, 236-6, 238, 

243, 246, 247, 240-60, 264, 260, 
270-1, 273, 288, 206-301 

W.O.S.B. procedures, 63, 122, 
126-7 

Research and Training Centre, 64-7, 
00-1, 03-4, 122, 207 
Research in vocational psychology, 
10, 21, 33, 36, 60, 73, 101, 284 
Rhodes, E. C., 18, 162, 306 
Richardson, M. W., 296 
Roberts, J. A. F., 167, 236, 304
Rodger, A., 27, 30, 40, 87-8, 90-1, 
94, 102, 118, 143, 206, 208, 
300 
Rodger, A. G., 183, 309
Rodger, T. F., 160 
Rogers, C. R., 110, 309 
Rollin, H. R., 161-2, 309 
Rorschach, H., 209, 253, 266-0 
Rothney, J. W. M., 182-3, 189 n., 304
Rowntrees Ltd., 19, 80, 208
Royal Armoured Corps, 44, 243 
Royal Army Medical Corps, 40 
Royal Army Service Corps, 230 
Royal Artillery, 126, 139 
Royal Canadian Air Force, 77, 136, 301
Royal Electrical and Mechanical 
Engineers, 64, 67 
Royal Engineers, 64, 263 
Royal Marine Corps, 36, 64, 218-7, 
239 

Royal Naval Personnel Research 
Committee, 37 
Russell, J. T., 130, 310 

S factor, 238 
Sarbin, T. R., 163-6, 309
Scates, D. E., 134, 306 
Science 4, 69, 72-4, 80, 299-301 
Scientific Adviser to the Air Ministry, 72

Scientific Adviser to the Army 
Council, 41, 61 
Scientific management, 13, 16 
Scott, I. D., 20, 96, 309 
Scottish Council for Research in Education, 18

Sea Cadets, 138, 199-201 
Seashore, C. E., 173, 209 
Second Sea Lord, 27 
Selection for secondary education, 18,
87, 93, 104, 116, 180, 204 
technical education, 180, 207 
university education, 18, 62-3, 171, 
204, 206, 209 

Selection Grades, 43, 176-7, 179-80, 
186, 197-8, 212, 216-17 
Selection in Allied Armies, 60 
Selection of administrators, 62-3, 
209-10 

African and Indian recruits, 60, 240
air fitters, 36, 121-2, 138, 199, 210,
214

air gunners, 69, 76, 242, 273-6
air mechanics, 36, 121-2, 186, 199,
210, 214

aircraft finishers, 280
aircrafthands, 279
aircrew, 17, 22, 68-78, 161, 184, 
198-9, 210-11, 226, 229-30,240- 
2, 248-9, 264, 260-78, 286, 299- 
301 

anti-aircraft personnel, 47, 209, 

211, 217-18, 236, 239, 248 


Selection of—contd.
anti-submarine (Asdic) personnel, 
17, 22, 26-6, 32, 34, 38, 172-3, 
210-11, 224, 231-2, 247, 298 
apprentices, 36-6, 60, 78-9, 168,
206-7, 210, 230, 243, 246-6, 297, 
290-300 


armoured car crews, 270 

armourers, 245, 279 

artificers, 26, 36, 193-4, 210-11,
214, 246, 207
assembly workers, 206 
balloon operators, 270 
batmen, 279 
bombers, see aircrew 
builders, 44, 192, 246 
business executives, 206, 210 
carpenters, 246, 280 
cinema projectionists, 210, 246 
Civil Servants, 64, 209, 289 
clerks, 44, 70, 120, 139, 172, 170, 
102, 206, 200, 213-14, 232, 260- 
eo, 278-80, 285, 296 
coastal forces ratings, 136, 199, 217,
239 


coders, 180, 199 

communications ratings, 137-8, 
199, 213-14 
compositors, 206 

cooks, 28, 139, 198-9, 210-11, 279 
cotton-mill machine &ers, 206 
deaf children, 261 
divers, 48 

draughtswomen, 239 
drivers, 16-17, 19, 44, 120, 138-9, 
168, 102, 209, 214, 224, 230-1, 
247, 269-60, 270, 206 
electrical mechanics, 26, 28, 36,
167-8, 210-12, 214, 228, 246 
electrical workers, 192, 246, 270- 
80, 297 

engineer officers, 64, 67, 211-12,
236, 298

engineers, 146, 170, 206, 243
equipment assistants, 270-80 
fabric workers, 279 
farm workers, 192 
fitters, 207, 243, 246, 301 
flight engineers, 242, 273-6 
glider pilots, 290-301 
ground observers, 279 
groundsmen, 279 

gunnery personnel, 26, 34, 44, 176, 
210-11, 213-14, 218, 247-8, 279 
infantry, 130, 213-16, 239, 243 
instructors, 32, 36, 61, 81, 210, 261,
270

interviewers, 144, 210, 285
labourers, 46, 192 
layers, see gunnery personnel 
linemen, 214

machine operators, 102, 206-7, 243 
managers, 06, 164, 267, 280 
mates, 102 
mechanicians, 36
mechanics, 32, 44, 60, 70, 88, 120-
2, 136-7, 140, 148, 171, 100,
200, 213-16, 228, 235, 230,
243-4, 200, 278, 280, 288, 200-7
engine room, 210-11, 214
flight, 270-80

instrument, 214, 245, 260, 270 
motor, 210-11, 214, 245 
ordnance, 210 
see also air, electrical, radio 
metal workers, 102, 206, 208, 260 
meteorologists, 270 
motor boat crew, 270 
motormen, 206; see also drivers 
musicians, 200 

N.C.O.s, 32, 61, 161, 161, 211, 218,
230

navigators, see aircrew
nurses, 206

officers, see Officer selection
orderlies, 44, 270
P.S.O.s, 41, 210
packers, 206 
parachutists, 101 
photographers, 210-11, 270-80 
pilots, see aircrew 
police, 63, 162, 169, 270 
precision workers, 102 
professional workers, 06, 206, 209, 
260 

psychological warfare staff, 03 
R.A.F. ground trades, 08, 70-2,
78-9, 108-9, 278-80 
radar operators, 25, 32, 34, 30, 168, 
173, 176, 210-11, 230, 236, 230, 
244-6, 240, 279-80, 208 
radio operators, 22, 170-80 
radio-wireless mechanics, 36, 71,
141-2, 167-8, 108-9, 210-11,
213-16, 243, 240, 279-80, 297, 209

recruiting assistants, 27, 210 
research workers, 210 
retail tradesmen, 102 
riflemen, 44, 214 

safety equipment ratings, 138,
168-9, 210-11 
salesmen, 163, 210 
school-teachers, 206, 210 
seamen, 32, 138, 140-1, 107-0, 
210-16, 228-9 
sick berth attendants, 211 
signallers and signalmen, 44, 137,
210-11, 214, 210-17, 236, 239,
244, 240-60, 266-7, 300


Selection of—contd.
special operators, 120, 260 
specialists, 17, 34, 60, 62-3, 06 
stenographers, 48 
stewards, 190, 210-12 
stokers, 28, 38, 140-1, 108-9, 210-
12, 214

storemen, 44, 214 
supply assistants, 210-11, 214,
232

switchboard operators, 130 
tank drivers, 22 

telegraphist air gunners, 36-6, 137,
210-11

telegraphists, 17, 28, 32, 137-8, 
108-0, 206, 200-11, 214, 260 
telephonists, 17, 176, 270-80 
teleprinter operators, 279 
toolmakers, 206 

torpedo ratings, 26-6, 32, 34, 36, 
210-11, 218, 279 

tradesmen, 18, 26, 30, 30, 44-6, 70,
100, 120-2, 103^, 242 
turners, 207 

W./T. slip readers, 270-80 
wireless operators, 60-70, 120, 273- 
4, 270-80, 206 
woodworkers, 102 
writers, 28, 32, 100, 210-12, 214, 

232, 276, 296-8 

Selection ratio, 86-7, 128-30, 178, 
100, 216-17 

Selection work abroad, 48, 60, 63, 240 
Selectivity, 116-17, 122-3, 126, 

137-8, 158-0, 108, 206, 212-13, 
231, 244, 240, 250, 274-6, 278, 
280, 284U6, 206 

Senior Psychologist's Department, 
27-38, 290-8 

Senior Training Corps, 100-200 
Service Flying Training School, 262- 
4, 271 

Shakow, D., 200, 208 

Shipley, W. C., 31, 66, 221, 226-6, 

233, 200, 300 

Shuttleworth, C. W., 200, 309
Slater, E., 260, 300
Slater, F., 236-7, 266, 258, 260, 302,
300

Smith, P., 87, 118, 206, 243, 306
South African Air Force, 240, 264
Spearman, C. E., 16, 222, 230
Special Entry Cadets (R.N.), 34
Special Service ratings, 26, 36
Stanbridge, R. N., 10, 300
Standardisation, see Tests
Stanines, 177 n., 274
Statistical methods in psychology, 6,
18, 16, 38, 40, 73-4, 100, 104-6,
108 n., 113, 178, 180, 284
Statistical significance, 106, 136, 192,
201-3, 274

Stenquist, J. L., 206-7, 242
Stephenson, W., 61, 68, 70-4, 78,
226, 236-8, 209-300
Stott, M. B., 106, 117, 309-10
Straker, D., 29, 107, 310
Strang, R., 143, 310 
Strong, E. K., 246, 269 
Student counselling, 13, 10, 89, 110 
Stuit, D. B., 168, 260 n., 310 
Summed S.G.s, 43-4, 101, 212 
Sutherland, J. D., 40, 64, 310 
Sutton E.M.S. Hospital, 268 
Swineford, F., 207, 306 
Symonds, P. M., 134, 143, 310 

Tavistock Clinic, 12 n., 64, 93
Taylor, F. W., 16
Taylor, H. C., 130, 310 
Temperament, see Personality 
Tennessee Valley Authority, 207 
Terman, L. M., 177, 186, 240 
Tests, acceptability of, 96, 97, 163-4, 
170, 180, 221, 256, 287 
analogue, 171-3 
analytic, 171-3, 200, 230-1 
apparatus, 160-70, 174, 249 
aptitude, 13, 170, 260 
compared with clinical judgments, 
163-6, 167-00, 162-4, 287 
construction of, 40, 49, 73, 100,
183-4, 172-6, 286, 288 
educational attainments, 13, 16, 18, 
26, 67, 74, 166, 168-9, 176, 180, 
183, 180, 193-6, 190-200, 206, 
213-16, 234, 238-9, 243, 246, 
261, 273, 276, 288-9
expendable, 170

face value of, 167, 241, 266, 283,
287

group v. individual, 160

intelligence, see Intelligence tests 
interests, see Interests, tests of 
inventive v. selective items, 170, 
176, 187, 221-2, 229, 237, 242, 
281, 288

limitations of, 94-6, 138-9, 163-4, 
168-7, 218-19, 281, 287
main characteristics of, 166-72 
mechanical scoring of, 18, 109-70, 
174 

miniature situation, 172, 176, 209, 
248, 288 

norms, see standardisation of 
origin and development of, 13-22 
paper-and-pencil v. practical, 47,
109-70, 207, 243-4, 248-9, 287-8
pass-marks on, see standards on
personality, see Personality tests


Tests—contd.

practicability of, 173-4, 208, 248,
266, 269, 288

relevance of in interviewing, 144,
147-9, 162

reliability, see Reliability, tests
screening, 167, 260; see also
Neurotic tendencies
standardisation of, 16, 40, 61, 66,
71, 166, 170-80, 187, 190, 196
standards on, 29-30, 44, 49, 71,
129-30, 177-80, 206, 278, 287
trade, see Trade tests
validity, see Validation of tests
weighting of, 46, 96, 104-6, 112,
164, 174, 274

work-sample, 62, 76, 171-2, 209, 
224, 248 

Tests, titles; see also Selection of administrators, etc.

Al, 299 
AT2, 209 

AH4, 184, 220, 231, 208 
AH6, 205 

Abstraction, 30-1, 66, 180, 186-6, 
193-6, 209, 212, 214, 217-18, 
221-8, 226, 232-3, 240, 244, 
246, 288, 206-8 

Agility, 43, 47, 172, 185, 100, 201,
214, 247-8, 266, 267 
Aircrew Aptitude Test Battery, 74, 
70-8, 273-6 

arithmetic, see Tests, mathematics
Army Alpha, 17, 196, 206, 214, 
224 

Army Beta, 17, 238 
auditory, 43, 100, 173, 231-2, 260- 
1, 207 

Bartlett, 60, 74 

Bennett, see Tests, mechanical
comprehension 

Bennett-Slater questionnaire, 268, 
260 

Binet, 13, 16, 17, 18, 177, 186 
Body Sway, 263
Carl Hollow Square, 66
Cattell Scale III, 196, 205
clerical, 43, 76, 168, 172, 178-0,
ise-e, 190, 202,212,214-6, 217- 
8, 222-4, 232-3, 260, 266, 288, 
200, 208 

Code Aptitude, see Tests, Morse
Aptitude 

colour-blindness, 20 
completion, 16 

Control of Velocity, 248-0, 273, 301
Co-ord.-A, B, C, D, 248-0, 273,
300-1

co-ordination, see manual dexterity
Cornell Selectee Index, 266, 268 





Tests, titles—contd.

Cox’s Mechanical, 207; see also
Tests, mechanical models
Craik Predictor, 248 
Cube Construction, 32, 166, 206-6, 
230 

Cursive Miniature Situation, 266 
dark adaptation, see night vision 
deterioration, 20, 221, 236 
dictation, see Tests, spelling 
Directions, 70, 224, 230-1, 246, 288
Dominoes, 236, 206
dotting machine, 264-6

E.M.T., 60 
Echo, 240 

electrical information, 141, 168, 
170, 103-4, 242-3, 246, 260, 207 
English, 30, 226, 208 
F.H.3, F.H.R., 40, 161, 220, 208
Figure Construction, 207, 233, 237-
8, 240, 297
Fitter I Maths., 301 
flap, 249, 264-6 
Formboards, 206, 208 
Form Relations, 206, 237 
G.I.T., 60 

G.V.K., 70-1, 186, 197, 226, 236, 
238, 273, 278-80, 200 
G4, 236, 273, 299 
G6, 299 
Gen. A, 273 
Gen. B, 226, 273, 200 
Gen. C, 273, 200 
Gen. D, 242, 273, 200 
General Classification, 206, 216
general knowledge, 69, 200 
geography, 67 

graph reading, 168, 230, 244-6, 
249, 208 

Group Test 26, 222, 232-3, 208
Group Test 33, 160, 206-8, 220,
233, 298

Group Test 70/1, 236, 244-6, 208
Group Test 70/23, 233, 236, 244-6,
298

Group Test 80, 207, 230, 208 
hearing, see Tests, auditory 
history, 67 

Ins-A, B, 230, 273, 300 
Instructions, 43, 46 n., 222-4, 232-
3, 236, 243, 246, 261, 268-6, 290 
K6, 238, 273, 200 
Kent-Shakow, see Tests, Formboards

Kohs Block Design, S2-3, 60, 104,
230-40, 246, 206
Kuder Preference Record, 260
layers, 248

leaderless groups, 34, 63, 61-3,
64-6, 126


Tests, titles— contd. 

Link Trainer, see Tests, Visual 
Link 

literacy, 131, 226 

Livingstone Hexagon, see night 
vision 

MacQuarrie mechanical, 208, 23.? 
manual dexterity, 107-8, 206, 207- 
0, 248, 263, 266, 273, 301 
Mat-A, B, C, D, E, F, 229, 273, 
290-300 

mathematics, 30-1, 43, 46 n., 48,
64, 67, 60-70, 141, 108, 170, 
184-6, 189-01, 103-6, 100-201, 
206-7, 212-8, 228-9, 232-4, 243- 
6,240-61, 273, 278, 288, 290-301 
Matrices, see Tests, Progressive 
Matrices 

Mec (A.T.S.), 203, 242, 261, 207 
Mec-A, B, C, 242, 273, 300 
mechanical, 10, 35, 60, 88, 106,
156, 168-70, 174, 180, 184-6,
213-16, 234-6, 238-0, 208
assembly, 43, 67, 169, 186, 100, 
201, 203, 206-7, 214, 242-3, 
246, 288, 206 

comprehension, 22, 30-1, 40, 43, 
46 n., 176-7, 170, 186, 190-1,
103-4, 201, 207, 212-14, 217- 
18, 230-1, 241-0, 273, 288, 
206, 208, 300 

information, 30, 67, 168, 170-1, 
103-4, 207, 216, 242-3, 240,
273, 287-8, 207, 200-300 
models, 193-4, 206-7, 243, 240, 
288, 207, 200 

Memory for Designs, 193, 206-7, 
233, 237, 244, 240, 207 
Messages, 43, 100 
Mill Hill Vocabulary, 60, 226 
Minnesota, see Tests, Formboards 
and Paper Formboard 
Moorrees, see Tests, Formboards 
Moray House, 18, 183 
Morse Aptitude, 43, 70, 169, 172, 
176-7, 186, 190, 249-60, 273, 
206, 300 

Morse Receiving, 137, 260 n. 
Murray, see Tests, Thematic Ap¬ 
perception 

musical talent, 173, 209 

N.D.R.C. Inventory, 268
Nav(R), 301

night vision, 17, 176, 209, 263, 266
nominations, 267, 200
numerical, see Tests, mathematics

O.I.R., 66, 160
O.R.1 Index, 46, 201 
Oakley, see Tests, Formboards 
Obs-A, B, C, 240-1, 273, 300 
Tests, titles—contd.

O’Connor Tweezer Dexterity, 263
O’Connor Wiggly Blocks, 206,
208

orientation, 233, 240, 297

oscilloscope reading, 244-6, 208 
Paper Formboard, 207, 237-8 
perceptual, 47, 200 
performance, 17, 18, 22, 100, 194, 
200, 238-40, 288 
perseveration, 263 
persistence, 263 
personal Tempo, 263 
physical, 108; see also Tests,
Agility 
physics, 57 
Pointers, 240, 264 
Practical Problems, 241 
Pressey Cross-out, 268 
Progressive Matrices, 20, 40, 43-4, 
40 n., 47, 60, 160-1, 168, 176-
7, 184, 186, 100-3, 106-8, 201-3,
212, 214-16, 217, 223, 230-1,
232-6, 238, 244-6, 248, 206 
projection, 20, 22, 34, 63, 66-8, 
166, 209, 266, 260, 200 
Purdue mechanical assembly, 206 
RA, RB, RE, RF, 30, 298

RC, 30, 227, 298

RD, 30, 221, 208
Radio Symbols, 246, 207
Raven, see Tests, Mill Hill; Tests,
Progressive Matrices
reaction time, 22, 64, 176, 209, 
230-2 

reading, 171, 216, 224-6, 232-3,
207 

Rorschach Inkblots, 200, 263, 
266-6 

S.M.A.3, 70, 248, 273, 300 
S.P., 296-8

S.P. 27-30 (Dictation), 226
scale reading, 168, 230, 244-6, 249, 
273, 298, 300 

Seashore, see musical talent 
Selection Test A, 231-2, 208 
sensory-motor, 47, 70, 74, 171, 
230-4, 264, 266 

Shipley, see Tests, Abstraction; 

Tests, vocabulary 
Short Format, see Tests, N.D.R.C. 
Inventory 

shorthand, 48, 137, 172 
sorting, 66, 100

spatial judgment, 167-9, 174, 183-
4, 187, 206-7, 213-16, 234-40,
243, 288, 298-0; see also Tests,
Squares

Speed of Response, see Tests,
Morse Aptitude 


Tests, titles—contd.
spelling, 30-1, 48, 180, 186-7,
190-1, 194-6, 214, 217-18, 224,
226-7, 229, 232, 260-1, 266, 288,
207-8

Squares, 31, 43, 168, 185-6, 190-1,
193-4, 207, 212-14, 217-18, 230,
236-7, 230-40, 244-8, 290
standard flight, 75, 78, 81, 267-9,
301

Stanford-Binet, see Tests, Binet
Strong Vocational Interest Blank,
246, 269-60

T2, 31, 36, 116, 140-2, 168-0, 180,
186-6, 191, 193-4, 197-0, 212-14,
216-18, 231-3, 244, 246, 248-9
tapping, 70 

Terman-Merrill, 177, 186, 240 
Thematic Apperception, 22, 67, 66,
166, 265

Track Tracer, 263-4
Trist-Misselbrook, see Tests, Kohs
Block Design 
typewriting, 48, 137 
V4, 226, 273, 299 
V.I.T., 66, 220-1, 233, 297 
verbal ability, 43, 167-8, 185, 190-
1, 201, 214, 224-7, 246, 273, 288,
207

Vincent, see Tests, mechanical
models

vision, 99, 166, 174-6, 200, 266
Visual Link, 77-8, 301
vocabulary, 170, 189, 221, 224-6,
233, 296; see also Tests, Mill
Hill; Tests, Wechsler-Bellevue
Wechsler-Bellevue, 38, 86, 226, 296
Weigl, see Tests, sorting
Wiggly Blocks, see Tests, O’Connor
Wiggly Blocks

Wirebending, 242-3, 246, 297 
Word Association, 87, 263, 266-6 
Thomson, G. H., 16
Thorndike, E. L., 16, 68
Thorne, General Sir A., 68
Thurstone, L. L., 16, 112, 207, 209,
288, 310

Tiffin, J., 86, 89, 130, 206, 310
Time and motion study, 38, 61, 91,
283

Toronto University, 69

Trade tests, 18, 107, 113-14, 121,
149, 172, 246

oral and written, 31-2, 48, 137, 149,
169, 172, 242-3, 246, 287, 297
Training methods, studies of, 6, IS,
16, 21, 36-7, 60-1, 71-2, 80-1,
98, 225

link with selection, 97-9, 283
Training Methods department, 69, 73
Training recommendations, IS-O,
2(SU 

Training Research department, 69, 
11-2,15 

Transfers to new jobs, 32, 34, 48-9,
63, 73 n., 88-90, 98-9, 225, 283
Traxler, A. E., 170, 310
Treasury, 64

Trist, E. L., 33, 60, 04, 226, 239-40, 
296, 304, 310 
Tuck, G. N., 40, 310 

U.S. Army, 123, 136, 213, 216, 240; 
see also American Army tests; 
American selection procedures 
U.S. Army Air Force, 76, 230, 240-
2, 249, 264, 260, 273, 200-301
U.S. Coastguard, 163